To Marjorie


Preface 


[Hilbert’s] style has not the terseness of many of our modern authors 
in mathematics, which is based on the assumption that printer’s labor 
and paper are costly but the reader’s effort and time are not. 


H. Weyl [143] 


The purpose of this book is to describe the classical problems in additive number 
theory and to introduce the circle method and the sieve method, which are the 
basic analytical and combinatorial tools used to attack these problems. This book 
is intended for students who want to learn additive number theory, not for experts 
who already know it. For this reason, proofs include many “unnecessary” and 
“obvious” steps; this is by design. 

The archetypical theorem in additive number theory is due to Lagrange: Every 
nonnegative integer is the sum of four squares. In general, the set A of nonnegative 
integers is called an additive basis of order h if every nonnegative integer can be 
written as the sum of h not necessarily distinct elements of A. Lagrange’s theorem 
is the statement that the squares are a basis of order four. The set A is called a 
basis of finite order if A is a basis of order h for some positive integer h. Additive 
number theory is in large part the study of bases of finite order. The classical bases 
are the squares, cubes, and higher powers; the polygonal numbers; and the prime 
numbers. The classical questions associated with these bases are Waring’s problem 
and the Goldbach conjecture. 

Waring’s problem is to prove that, for every k ≥ 2, the nonnegative kth powers
form a basis of finite order. We prove several results connected with Waring’s 
problem, including Hilbert’s theorem that every nonnegative integer is the sum of
a bounded number of kth powers, and the Hardy–Littlewood asymptotic formula
for the number of representations of an integer as the sum of s positive kth powers. 

Goldbach conjectured that every even positive integer is the sum of at most 
two prime numbers. We prove three of the most important results on the Gold- 
bach conjecture: Shnirel’man’s theorem that the primes are a basis of finite order, 
Vinogradov’s theorem that every sufficiently large odd number is the sum of three 
primes, and Chen’s theorem that every sufficiently large even integer is the sum of
a prime and a number that is a product of at most two primes. 

Many unsolved problems remain. The Goldbach conjecture has not been proved. 
There is no proof of the conjecture that every sufficiently large integer is the sum 
of four nonnegative cubes, nor can we obtain a good upper bound for the least 
number s of nonnegative kth powers such that every sufficiently large integer 
is the sum of s kth powers. It is possible that neither the circle method nor the 
sieve method is powerful enough to solve these problems and that completely 
new mathematical ideas will be necessary, but certainly there will be no progress 
without an understanding of the classical methods. 

The prerequisites for this book are undergraduate courses in number theory and 
real analysis. The appendix contains some theorems about arithmetic functions 
that are not necessarily part of a first course in elementary number theory. In a 
few places (for example, Linnik’s theorem on sums of seven cubes, Vinogradov’s 
theorem on sums of three primes, and Chen’s theorem on sums of a prime and an 
almost prime), we use results about the distribution of prime numbers in arithmetic 
progressions. These results can be found in Davenport’s Multiplicative Number 
Theory [19]. 

Additive number theory is a deep and beautiful part of mathematics, but for 
too long it has been obscure and mysterious, the domain of a small number of 
specialists, who have often been specialists only in their own small part of additive 
number theory. This is the first of several books on additive number theory. I hope
that these books will demonstrate the richness and coherence of the subject and 
that they will encourage renewed interest in the field. 

I have taught additive number theory at Southern Illinois University at Carbon- 
dale, Rutgers University—New Brunswick, and the City University of New York 
Graduate Center, and I am grateful to the students and colleagues who participated 
in my graduate courses and seminars. I also wish to thank Henryk Iwaniec, from 
whom I learned the linear sieve and the proof of Chen’s theorem. 

This work was supported in part by grants from the PSC-CUNY Research Award 
Program and the National Security Agency Mathematical Sciences Program. 

I would very much like to receive comments or corrections from readers of this
book. My e-mail addresses are nathansn@alpha.lehman.cuny.edu and
nathanson@worldnet.att.net. A list of errata will be available on my homepage at
http://www.lehman.cuny.edu or http://math.lehman.cuny.edu/nathanson.


Melvyn B. Nathanson 
Maplewood, New Jersey 
May 1, 1996 
Notation and conventions


Theorems, lemmas, and corollaries are numbered consecutively in each chapter 
and in the Appendix. For example, Lemma 2.1 is the first lemma in Chapter 2 and 
Theorem A.2 is the second theorem in the Appendix. 

The lowercase letter p denotes a prime number. 

We adhere to the usual convention that the empty sum (the sum containing no 
terms) is equal to zero and the empty product is equal to one. 

Let f be any real or complex-valued function, and let g be a positive function. 
The functions f and g can be functions of a real variable x or arithmetic functions 
defined only on the positive integers. We write 


\[ f = O(g) \]
or
\[ f \ll g \]
or
\[ g \gg f \]

if there exists a constant c > 0 such that

\[ |f(x)| \le c\,g(x) \]

for all x in the domain of f. The constant c is called the implied constant. We
write
\[ f \ll_{a,b,\ldots} g \]
if there exists a constant c > 0 that depends on a, b, ... such that

\[ |f(x)| \le c\,g(x) \]


for all x in the domain of f. We write
\[ f = o(g) \]
if
\[ \lim_{x \to \infty} \frac{f(x)}{g(x)} = 0. \]
The function f is asymptotic to g, denoted
\[ f \sim g, \]
if
\[ \lim_{x \to \infty} \frac{f(x)}{g(x)} = 1. \]
The real-valued function f is increasing on the interval I if $f(x_1) \le f(x_2)$ for all
$x_1, x_2 \in I$ with $x_1 < x_2$. Similarly, the real-valued function f is decreasing on
the interval I if $f(x_1) \ge f(x_2)$ for all $x_1, x_2 \in I$ with $x_1 < x_2$. The function f is
monotonic on the interval I if it is either increasing on I or decreasing on I.
We use the following notation for exponential functions:

\[ \exp(x) = e^x \]

and
\[ e(x) = \exp(2\pi i x) = e^{2\pi i x}. \]


The following notation is standard: 


$\mathbf{Z}$              the integers $0, \pm 1, \pm 2, \ldots$
$\mathbf{R}$              the real numbers
$\mathbf{R}^n$            n-dimensional Euclidean space
$\mathbf{Z}^n$            the integer lattice in $\mathbf{R}^n$
$\mathbf{C}$              the complex numbers
$|z|$                     the absolute value of the complex number z
$\Re z$                   the real part of the complex number z
$\Im z$                   the imaginary part of the complex number z
$[x]$                     the integer part of the real number x,
                          that is, the integer uniquely determined
                          by the inequality $[x] \le x < [x] + 1$.
$\{x\}$                   the fractional part of the real number x,
                          that is, $\{x\} = x - [x] \in [0, 1)$.
$\|x\|$                   the distance from the real number x
                          to the nearest integer, that is,
                          $\|x\| = \min\{|x - n| : n \in \mathbf{Z}\} = \min(\{x\}, 1 - \{x\}) \in [0, 1/2]$.
$(a_1, \ldots, a_n)$      the greatest common divisor of the integers $a_1, \ldots, a_n$
$[a_1, \ldots, a_n]$      the least common multiple of the integers $a_1, \ldots, a_n$
$|X|$                     the cardinality of the set X
$hA$                      the h-fold sumset, consisting of all sums of h elements of A


Part I 


Waring’s problem 


I 


Sums of polygons 


Imo propositionem pulcherrimam et maxime generalem nos primi de-
teximus: nempe omnem numerum vel esse triangulum vel ex duobus
aut tribus triangulis compositum: esse quadratum vel ex duobus aut
tribus aut quatuor quadratis compositum: esse pentagonum vel ex duo-
bus, tribus, quatuor aut quinque pentagonis compositum; et sic dein-
ceps in infinitum, in hexagonis, heptagonis polygonis quibuslibet,
enuntianda videlicet pro numero angulorum generali et mirabili pro-
positione. Ejus autem demonstrationem, quae ex multis variis et abstru-
sissimis numerorum mysteriis derivatur, hic apponere non licet. ...¹


P. Fermat [39, page 303] 


¹I have discovered a most beautiful theorem of the greatest generality: Every number
is a triangular number or the sum of two or three triangular numbers; every number is a 
square or the sum of two, three, or four squares; every number is a pentagonal number or 
the sum of two, three, four, or five pentagonal numbers; and so on for hexagonal numbers, 
heptagonal numbers, and all other polygonal numbers. The precise statement of this very 
beautiful and general theorem depends on the number of the angles. The theorem is based 
on the most diverse and abstruse mysteries of numbers, but I am not able to include the 
proof here. ... 




1.1 Polygonal numbers 


Polygonal numbers are nonnegative integers constructed geometrically from the 
regular polygons. The triangular numbers, or triangles, count the number of points 
in the triangular array 


The sequence of triangles is 0, 1, 3,6, 10, 15,.... 
Similarly, the square numbers count the number of points in the square array 


The sequence of squares is 0, 1, 4,9, 16, 25,.... 
The pentagonal numbers count the number of points in the pentagonal array 


The sequence of pentagonal numbers is 0, 1,5, 12, 22, 35, .... There is a similar 
sequence of m-gonal numbers corresponding to every regular polygon with m 
sides. 

Algebraically, for every m ≥ 1, the kth polygonal number of order m + 2, denoted
$p_m(k)$, is the sum of the first k terms of the arithmetic progression with initial value
1 and difference m, that is,

\[
\begin{aligned}
p_m(k) &= 1 + (m + 1) + (2m + 1) + \cdots + ((k - 1)m + 1) \\
       &= \frac{mk(k - 1)}{2} + k.
\end{aligned}
\]

This is a quadratic polynomial in k. The triangular numbers are the numbers
\[ p_1(k) = \frac{k(k + 1)}{2}, \]
the squares are the numbers
\[ p_2(k) = k^2, \]
the pentagonal numbers are the numbers
\[ p_3(k) = \frac{k(3k - 1)}{2}, \]
and so on. This notation is awkward but traditional.
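
The closed form for the polygonal numbers can be checked numerically. The following Python sketch computes the kth polygonal number of order m + 2 both from the formula mk(k - 1)/2 + k and from the defining arithmetic progression; the function names `polygonal` and `polygonal_by_sum` are illustrative choices, not notation from the text.

```python
def polygonal(m: int, k: int) -> int:
    """k-th polygonal number of order m + 2, from the closed form m*k*(k-1)/2 + k."""
    if m < 1 or k < 0:
        raise ValueError("expected m >= 1 and k >= 0")
    # k*(k-1) is always even, so the integer division is exact.
    return m * k * (k - 1) // 2 + k


def polygonal_by_sum(m: int, k: int) -> int:
    """Same number, as the sum of the first k terms of 1, m+1, 2m+1, ..."""
    return sum(j * m + 1 for j in range(k))


if __name__ == "__main__":
    for m, name in [(1, "triangles"), (2, "squares"), (3, "pentagonal numbers")]:
        print(name, [polygonal(m, k) for k in range(6)])
```

For m = 1, 2, 3 the printed lists should agree with the three sequences listed at the start of this section.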

The epigraph to this chapter is one of the famous notes that Fermat wrote in 
the margin of his copy of Diophantus’s Arithmetica. Fermat claims that, for every 
m ≥ 1, every nonnegative integer can be written as the sum of m + 2 polygonal
numbers of order m + 2. This was proved by Cauchy in 1813. The goal of this
chapter is to prove Cauchy’s polygonal number theorem. We shall also prove the
related result of Legendre that, for every m ≥ 3, every sufficiently large integer is
the sum of five polygonal numbers of order m + 2. 


1.2 Lagrange’s theorem 


We first prove the polygonal number theorem for squares. This theorem of La- 
grange is the most important result in additive number theory. 


Theorem 1.1 (Lagrange) Every nonnegative integer is the sum of four squares. 
Proof. It is easy to check the formal polynomial identity 
\[ (x_1^2 + x_2^2 + x_3^2 + x_4^2)(y_1^2 + y_2^2 + y_3^2 + y_4^2) = z_1^2 + z_2^2 + z_3^2 + z_4^2, \tag{1.1} \]

where

\[
\begin{aligned}
z_1 &= x_1y_1 + x_2y_2 + x_3y_3 + x_4y_4 \\
z_2 &= x_1y_2 - x_2y_1 - x_3y_4 + x_4y_3 \\
z_3 &= x_1y_3 - x_3y_1 + x_2y_4 - x_4y_2 \\
z_4 &= x_1y_4 - x_4y_1 - x_2y_3 + x_3y_2.
\end{aligned}
\tag{1.2}
\]

This implies that if two numbers are both sums of four squares, then their product
is also the sum of four squares. Every nonnegative integer is the product of primes,
so it suffices to prove that every prime number is the sum of four squares. Since
$2 = 1^2 + 1^2 + 0^2 + 0^2$, we consider only odd primes p.

The set of squares 


\[ \{ a^2 : a = 0, 1, \ldots, (p - 1)/2 \} \]

represents (p + 1)/2 distinct congruence classes modulo p. Similarly, the set of
integers
\[ \{ -b^2 - 1 : b = 0, 1, \ldots, (p - 1)/2 \} \]

represents (p + 1)/2 distinct congruence classes modulo p. Since there are only
p different congruence classes modulo p, by the pigeonhole principle there must
exist integers a and b such that $0 \le a, b \le (p - 1)/2$ and

\[ a^2 \equiv -b^2 - 1 \pmod{p}, \]
that is,
\[ a^2 + b^2 + 1 \equiv 0 \pmod{p}. \]
Let $a^2 + b^2 + 1 = np$. Then

\[ p \le np = a^2 + b^2 + 1 \le 2\left(\frac{p - 1}{2}\right)^2 + 1 < \frac{p^2}{2} + 1 < p^2, \]
and so
\[ 1 \le n < p. \]

Let m be the least positive integer such that mp is the sum of four squares. Then
there exist integers $x_1, x_2, x_3, x_4$ such that

\[ mp = x_1^2 + x_2^2 + x_3^2 + x_4^2 \]
and
\[ 1 \le m \le n < p. \]


We must show that m = 1. 
Suppose not. Then 1 < m < p. Choose integers $y_i$ such that

\[ y_i \equiv x_i \pmod{m} \]

and
\[ -m/2 < y_i \le m/2 \]

for i = 1, ..., 4. Then
\[ y_1^2 + y_2^2 + y_3^2 + y_4^2 \equiv x_1^2 + x_2^2 + x_3^2 + x_4^2 = mp \equiv 0 \pmod{m} \]
and
\[ mr = y_1^2 + y_2^2 + y_3^2 + y_4^2 \]

for some nonnegative integer r. If r = 0, then $y_i = 0$ for all i and each $x_i^2$ is divisible
by $m^2$. It follows that mp is divisible by $m^2$, and so p is divisible by m. This is
impossible, since p is prime and 1 < m < p. Therefore, r ≥ 1 and

\[ mr = y_1^2 + y_2^2 + y_3^2 + y_4^2 \le 4(m/2)^2 = m^2. \]

Moreover, r = m if and only if m is even and $y_i = m/2$ for all i. In this case,
$x_i \equiv m/2 \pmod{m}$ for all i, and so $x_i^2 \equiv (m/2)^2 \pmod{m^2}$ and

\[ mp = x_1^2 + x_2^2 + x_3^2 + x_4^2 \equiv 4(m/2)^2 = m^2 \equiv 0 \pmod{m^2}. \]


This implies that p is divisible by m, which is absurd. Therefore,
\[ 1 \le r < m. \]
Applying the polynomial identity (1.1), we obtain

\[
\begin{aligned}
m^2 rp &= (mp)(mr) \\
       &= (x_1^2 + x_2^2 + x_3^2 + x_4^2)(y_1^2 + y_2^2 + y_3^2 + y_4^2) \\
       &= z_1^2 + z_2^2 + z_3^2 + z_4^2,
\end{aligned}
\]

where the $z_i$ are defined by equations (1.2). Since $x_i \equiv y_i \pmod{m}$, these
equations imply that $z_i \equiv 0 \pmod{m}$ for i = 1, ..., 4. Let $w_i = z_i/m$. Then
$w_1, \ldots, w_4$ are integers and

\[ rp = w_1^2 + w_2^2 + w_3^2 + w_4^2, \]


which contradicts the minimality of m. Therefore, m = 1 and the prime p is the 
sum of four squares. This completes the proof of Lagrange’s theorem. 
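
The proof of Lagrange's theorem is constructive, and its steps can be turned into a short program. The sketch below is one possible reading, not code from the text: `euler_product` applies identity (1.1)-(1.2), `four_squares_prime` finds a and b by searching the residues directly and then repeats the descent step from m to r until the multiplier is 1, and `four_squares` factors n by trial division (an assumption made for simplicity; any factoring method would do). All function names are hypothetical.

```python
def euler_product(x, y):
    """Four squares for (sum of x_i^2)(sum of y_i^2), using identity (1.1)-(1.2)."""
    x1, x2, x3, x4 = x
    y1, y2, y3, y4 = y
    return (
        x1 * y1 + x2 * y2 + x3 * y3 + x4 * y4,
        x1 * y2 - x2 * y1 - x3 * y4 + x4 * y3,
        x1 * y3 - x3 * y1 + x2 * y4 - x4 * y2,
        x1 * y4 - x4 * y1 - x2 * y3 + x3 * y2,
    )


def centered(x, m):
    """The residue y of x modulo m with -m/2 < y <= m/2."""
    y = x % m
    return y - m if 2 * y > m else y


def four_squares_prime(p):
    """Four integers whose squares sum to the prime p (p is assumed prime)."""
    if p == 2:
        return (1, 1, 0, 0)
    half = (p - 1) // 2
    square_root = {a * a % p: a for a in range(half + 1)}
    # Pigeonhole step: some a^2 is congruent to -b^2 - 1 modulo p.
    for b in range(half + 1):
        a = square_root.get((-b * b - 1) % p)
        if a is not None:
            break
    else:
        raise ValueError("no solution found; is p an odd prime?")
    x = (a, b, 1, 0)
    m = (a * a + b * b + 1) // p
    # Descent step: replace m by r with 1 <= r < m, as in the proof.
    while m > 1:
        y = tuple(centered(xi, m) for xi in x)
        z = euler_product(x, y)
        x = tuple(zi // m for zi in z)  # each z_i is divisible by m
        m = sum(xi * xi for xi in x) // p
    return tuple(abs(xi) for xi in x)


def four_squares(n):
    """Four nonnegative integers whose squares sum to n >= 0."""
    if n < 0:
        raise ValueError("n must be nonnegative")
    if n == 0:
        return (0, 0, 0, 0)
    rep = (1, 0, 0, 0)
    d = 2
    while d * d <= n:
        while n % d == 0:  # d is prime whenever this condition holds
            rep = euler_product(rep, four_squares_prime(d))
            n //= d
        d += 1
    if n > 1:
        rep = euler_product(rep, four_squares_prime(n))
    return tuple(sorted(abs(v) for v in rep))


if __name__ == "__main__":
    for n in (7, 30, 1996):
        print(n, four_squares(n))
```

The loop in `four_squares_prime` terminates because each pass strictly decreases the multiplier m, which is exactly the inequality 1 ≤ r < m established in the proof.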

A set of integers is called a basis of order h if every nonnegative integer can be 
written as the sum of h not necessarily distinct elements of the set. A set of integers 
is called a basis of finite order if the set is a basis of order h for some h. Lagrange’s 
theorem states that the set of squares is a basis of order four. Since 7 cannot be 
written as the sum of three squares, it follows that the squares do not form a basis 
of order three. The central problem in additive number theory is to determine if a 
given set of integers is a basis of finite order. Lagrange’s theorem gives the first 
example of a natural and important set of integers that is a basis. In this sense, it 
is the archetypical theorem in additive number theory. Everything in this book is a 
generalization of Lagrange’s theorem. We shall prove that the polygonal numbers, 
the cubes and higher powers, and the primes are all bases of finite order. These are 
the classical bases in additive number theory. 
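
The definition of a basis of order h can be explored by brute force on an initial segment of the integers. The sketch below (the helper name `h_fold_sums` and the cutoff 200 are arbitrary choices for illustration) lists the integers up to a limit that are sums of h elements of a set, and compares h = 3 with h = 4 for the squares. A finite check of this kind illustrates the statements above but proves nothing about all integers.

```python
def h_fold_sums(A, h, limit):
    """All sums <= limit of h not necessarily distinct elements of A."""
    elems = sorted(a for a in set(A) if 0 <= a <= limit)
    sums = {0}  # the empty sum
    for _ in range(h):
        sums = {s + a for s in sums for a in elems if s + a <= limit}
    return sums


if __name__ == "__main__":
    limit = 200
    squares = [k * k for k in range(15)]  # 14^2 = 196 <= 200 < 15^2
    three = h_fold_sums(squares, 3, limit)
    four = h_fold_sums(squares, 4, limit)
    print("not a sum of three squares:", [n for n in range(limit + 1) if n not in three][:5])
    print("every n <= 200 is a sum of four squares:", all(n in four for n in range(limit + 1)))
```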


1.3 Quadratic forms


Let $A = (a_{i,j})$ be an m × n matrix with integer coefficients. In this chapter, we
shall only consider matrices with integer coefficients. Let $A^T$ denote the transpose
of the matrix A, that is, $A^T = (a^T_{i,j})$ is the n × m matrix such that

\[ a^T_{i,j} = a_{j,i} \]

for i = 1, ..., n and j = 1, ..., m. Then $(A^T)^T = A$ for every m × n matrix A,
and $(AB)^T = B^T A^T$ for any pair of matrices A and B such that the number of
columns of A is equal to the number of rows of B.

Let $M_n(\mathbf{Z})$ be the ring of n × n matrices. A matrix $A \in M_n(\mathbf{Z})$ is symmetric if
$A^T = A$. If A is a symmetric matrix and U is any matrix in $M_n(\mathbf{Z})$, then $U^T A U$ is
also symmetric, since

\[ (U^T A U)^T = U^T A^T (U^T)^T = U^T A U. \]
Let $SL_n(\mathbf{Z})$ denote the group of n × n matrices of determinant 1. This group acts
on the ring $M_n(\mathbf{Z})$ as follows: If $A \in M_n(\mathbf{Z})$ and $U \in SL_n(\mathbf{Z})$, we define

\[ A \cdot U = U^T A U. \]
This is a group action, since
\[ A \cdot (UV) = (UV)^T A (UV) = V^T (U^T A U) V = (U^T A U) \cdot V = (A \cdot U) \cdot V. \]
We say that two matrices A and B in $M_n(\mathbf{Z})$ are equivalent, denoted
\[ A \sim B, \]
if A and B lie in the same orbit of the group action, that is, if $B = A \cdot U = U^T A U$
for some $U \in SL_n(\mathbf{Z})$. It is easy to check that this is an equivalence relation. Since
det(U) = 1 for all $U \in SL_n(\mathbf{Z})$, it follows that

\[ \det(A \cdot U) = \det(U^T A U) = \det(U^T) \det(A) \det(U) = \det(A) \]

for all $A \in M_n(\mathbf{Z})$, and so the group action preserves determinants. Also, if A is
symmetric, then $A \cdot U$ is also symmetric. Thus, for any integer d, the group action
partitions the set of symmetric n × n matrices of determinant d into equivalence
classes.

To every n × n symmetric matrix $A = (a_{i,j})$ we associate the quadratic form $F_A$
defined by
\[ F_A(x_1, \ldots, x_n) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{i,j} x_i x_j. \]
This is a homogeneous function of degree two in the n variables $x_1, \ldots, x_n$. For
example, if $I_n$ is the n × n identity matrix, then the associated quadratic form is
\[ F_{I_n}(x_1, \ldots, x_n) = x_1^2 + x_2^2 + \cdots + x_n^2. \]

Let x denote the n × 1 matrix (or column vector)

\[ x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}. \]

We can write the quadratic form in matrix notation as follows:
\[ F_A(x_1, \ldots, x_n) = x^T A x. \]

The discriminant of the quadratic form $F_A$ is the determinant of the matrix A. Let
A and B be n × n symmetric matrices, and let $F_A$ and $F_B$ be their corresponding
quadratic forms. We say that these forms are equivalent, denoted

\[ F_A \sim F_B, \]
if the matrices are equivalent, that is, if A ~ B. Equivalence of quadratic forms is an
equivalence relation, and equivalent quadratic forms have the same discriminant.
The quadratic form $F_A$ represents the integer N if there exist integers $x_1, \ldots, x_n$
such that
\[ F_A(x_1, \ldots, x_n) = N. \]

If $F_A \sim F_B$, then A ~ B and there exists a matrix $U \in SL_n(\mathbf{Z})$ such that
$A = B \cdot U = U^T B U$. It follows that

\[ F_A(x) = x^T A x = x^T U^T B U x = (Ux)^T B (Ux) = F_B(Ux). \]

Thus, if the quadratic form $F_A$ represents the integer N, then every form equivalent
to $F_A$ also represents N. Since equivalence of quadratic forms is an equivalence
relation, it follows that any two quadratic forms in the same equivalence class
represent exactly the same set of integers. Lagrange’s theorem implies that, for
n ≥ 4, any form equivalent to the form $x_1^2 + \cdots + x_n^2$ represents all nonnegative
integers.
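
The group action and the change of variables can be checked on a small example. The sketch below uses plain Python lists as integer matrices; the helper names (`act`, `form`, `det`) and the particular matrices B, U, V are choices made for this illustration only. It verifies that the action is compatible with matrix multiplication, preserves symmetry and determinant, and that F_A(x) = F_B(Ux) when A = U^T B U.

```python
from itertools import product


def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def act(A, U):
    """The action A . U = U^T A U."""
    return matmul(matmul(transpose(U), A), U)


def form(A, x):
    """F_A(x) = x^T A x for a list of integers x."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def det(M):
    """Determinant by expansion along the first row (fine for small matrices)."""
    if len(M) == 1:
        return M[0][0]
    return sum(
        (-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
        for j in range(len(M))
    )


if __name__ == "__main__":
    B = [[2, 1], [1, 3]]   # symmetric, determinant 5
    U = [[1, 1], [0, 1]]   # determinant 1
    V = [[1, 0], [2, 1]]   # determinant 1
    A = act(B, U)
    print("A =", A, "det A =", det(A))
    print("action law:", act(B, matmul(U, V)) == act(act(B, U), V))
    same = all(
        form(A, [x1, x2]) == form(B, [U[0][0] * x1 + U[0][1] * x2, U[1][0] * x1 + U[1][1] * x2])
        for x1, x2 in product(range(-5, 6), repeat=2)
    )
    print("F_A(x) == F_B(Ux) on the test grid:", same)
```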

The quadratic form $F_A$ is called positive-definite if $F_A(x_1, \ldots, x_n) \ge 1$ for all
$(x_1, \ldots, x_n) \ne (0, \ldots, 0)$. Every form equivalent to a positive-definite quadratic
form is positive-definite. 

A quadratic form in two variables is called a binary quadratic form. A quadratic 
form in three variables is called a ternary quadratic form. For binary and ternary 
quadratic forms, we shall prove that there is only one equivalence class of positive- 
definite forms of discriminant 1. We begin with binary forms. 


Lemma 1.1 Let
\[ A = \begin{pmatrix} a_{1,1} & a_{1,2} \\ a_{1,2} & a_{2,2} \end{pmatrix} \]
be a 2 × 2 symmetric matrix, and let

\[ F_A(x_1, x_2) = a_{1,1} x_1^2 + 2a_{1,2} x_1 x_2 + a_{2,2} x_2^2 \]

be the associated quadratic form. The binary quadratic form $F_A$ is positive-definite
if and only if
\[ a_{1,1} \ge 1 \]
and the discriminant d satisfies
\[ d = \det(A) = a_{1,1} a_{2,2} - a_{1,2}^2 \ge 1. \]
Proof. If the form $F_A$ is positive-definite, then
\[ F_A(1, 0) = a_{1,1} \ge 1 \]
and
\[
\begin{aligned}
F_A(-a_{1,2}, a_{1,1}) &= a_{1,1} a_{1,2}^2 - 2a_{1,1} a_{1,2}^2 + a_{1,1}^2 a_{2,2} \\
                       &= a_{1,1} (a_{1,1} a_{2,2} - a_{1,2}^2) \\
                       &= a_{1,1} d \ge 1,
\end{aligned}
\]
and so d ≥ 1. Conversely, if $a_{1,1} \ge 1$ and d ≥ 1, then
\[ a_{1,1} F_A(x_1, x_2) = (a_{1,1} x_1 + a_{1,2} x_2)^2 + d x_2^2 \ge 0, \]
and $F_A(x_1, x_2) = 0$ if and only if $(x_1, x_2) = (0, 0)$. This completes the proof.


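Lemma 1.2 Every equivalence class of positive-definite binary quadratic forms
of discriminant d contains at least one form

\[ F_A(x_1, x_2) = a_{1,1} x_1^2 + 2a_{1,2} x_1 x_2 + a_{2,2} x_2^2 \]

for which

\[ 2|a_{1,2}| \le a_{1,1} \le \frac{2}{\sqrt{3}} \sqrt{d}. \]

Proof. Let $F_B(x_1, x_2) = b_{1,1} x_1^2 + 2b_{1,2} x_1 x_2 + b_{2,2} x_2^2$ be a positive-definite quad-
ratic form, where
\[ B = \begin{pmatrix} b_{1,1} & b_{1,2} \\ b_{1,2} & b_{2,2} \end{pmatrix} \]
is the 2 × 2 symmetric matrix associated with $F = F_B$. Let $a_{1,1}$ be the smallest positive
integer represented by F. Then there exist integers $r_1, r_2$ such that

\[ F(r_1, r_2) = a_{1,1}. \]

If the positive integer h divides both $r_1$ and $r_2$, then, by the homogeneity of the
form and the minimality of $a_{1,1}$, we have

\[ a_{1,1} \le F(r_1/h, r_2/h) = \frac{F(r_1, r_2)}{h^2} = \frac{a_{1,1}}{h^2}, \]
and so h = 1. Therefore, $(r_1, r_2) = 1$ and there exist integers $s_1$ and $s_2$ such that

\[ 1 = r_1 s_2 - r_2 s_1 = r_1 (s_2 + r_2 t) - r_2 (s_1 + r_1 t) \]

for all integers t. Then

\[ U = \begin{pmatrix} r_1 & s_1 + r_1 t \\ r_2 & s_2 + r_2 t \end{pmatrix} \in SL_2(\mathbf{Z}) \]

for all $t \in \mathbf{Z}$. Let

\[
\begin{aligned}
A = U^T B U
  &= \begin{pmatrix} F(r_1, r_2) & a'_{1,2} + F(r_1, r_2) t \\ a'_{1,2} + F(r_1, r_2) t & F(s_1 + r_1 t, s_2 + r_2 t) \end{pmatrix} \\
  &= \begin{pmatrix} a_{1,1} & a_{1,2} \\ a_{1,2} & a_{2,2} \end{pmatrix},
\end{aligned}
\]
where

\[
\begin{aligned}
a'_{1,2} &= b_{1,1} r_1 s_1 + b_{1,2} (r_1 s_2 + r_2 s_1) + b_{2,2} r_2 s_2 \\
a_{1,2} &= a'_{1,2} + a_{1,1} t \\
a_{2,2} &= F(s_1 + r_1 t, s_2 + r_2 t) \ge a_{1,1},
\end{aligned}
\]

since $(s_1 + r_1 t, s_2 + r_2 t) \ne (0, 0)$ for all $t \in \mathbf{Z}$, and $a_{1,1}$ is the smallest positive
number represented by the form F. Since $\{a'_{1,2} + a_{1,1} t : t \in \mathbf{Z}\}$ is a congruence
class modulo $a_{1,1}$, we can choose t so that

\[ |a_{1,2}| = |a'_{1,2} + a_{1,1} t| \le \frac{a_{1,1}}{2}. \]

Then A ~ B, and the form $F_B$ is equivalent to the form $F_A(x_1, x_2) = a_{1,1} x_1^2 +
2a_{1,2} x_1 x_2 + a_{2,2} x_2^2$, where

\[ 2|a_{1,2}| \le a_{1,1} \le a_{2,2}. \]
If d is the discriminant of the form, then

\[ d = a_{1,1} a_{2,2} - a_{1,2}^2, \]

and the inequality
\[ a_{1,1}^2 \le a_{1,1} a_{2,2} = d + a_{1,2}^2 \le d + \frac{a_{1,1}^2}{4} \]
implies that
\[ \frac{3a_{1,1}^2}{4} \le d \]
or, equivalently,
\[ a_{1,1} \le \frac{2}{\sqrt{3}} \sqrt{d}. \]

This completes the proof.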


Theorem 1.2 Every positive-definite binary quadratic form of discriminant 1 is
equivalent to the form $x_1^2 + x_2^2$.


Proof. Let F be a positive-definite binary quadratic form of discriminant 1. By
Lemma 1.2, the form F is equivalent to a form $a_{1,1} x_1^2 + 2a_{1,2} x_1 x_2 + a_{2,2} x_2^2$ for which

\[ 2|a_{1,2}| \le a_{1,1} \le \frac{2}{\sqrt{3}} < 2. \]

Since $a_{1,1} \ge 1$, we must have $a_{1,1} = 1$. This implies that $a_{1,2} = 0$. Since the
discriminant is 1, we have

\[ a_{2,2} = a_{1,1} a_{2,2} - a_{1,2}^2 = 1. \]

Thus, the form F is equivalent to $x_1^2 + x_2^2$. This completes the proof.
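
The proof of Lemma 1.2 chooses the smallest positive integer represented by the form, which is not directly computable as stated. A practical substitute is the classical reduction procedure of Lagrange and Gauss, sketched below; it is a different technique from the one in the proof, but it ends at a matrix satisfying the same inequality 2|a_{1,2}| ≤ a_{1,1} ≤ a_{2,2}. The function name `reduce_binary` is hypothetical, and the input is assumed to be the symmetric integer matrix of a positive-definite form.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def act(A, U):
    """The action A . U = U^T A U."""
    return matmul(matmul(transpose(U), A), U)


def reduce_binary(B):
    """Return (A, U) with A = U^T B U, det U = 1, and 2|a12| <= a11 <= a22.

    Assumes B is the 2 x 2 symmetric matrix of a positive-definite form.
    """
    A = [row[:] for row in B]
    U = [[1, 0], [0, 1]]
    while True:
        a, b = A[0]
        # Translate x1 -> x1 - q*x2 so that the new a12 lies in [-a/2, a/2).
        q = (2 * b + a) // (2 * a)
        T = [[1, -q], [0, 1]]
        A, U = act(A, T), matmul(U, T)
        if A[0][0] <= A[1][1]:
            return A, U
        # Otherwise swap the roles of the variables; a11 strictly decreases.
        S = [[0, -1], [1, 0]]
        A, U = act(A, S), matmul(U, S)


if __name__ == "__main__":
    B = [[5, 7], [7, 10]]  # positive-definite, discriminant 1
    A, U = reduce_binary(B)
    print("reduced matrix:", A)
    print("transforming matrix:", U)
```

For a form of discriminant 1 the reduced matrix should be the identity, in agreement with Theorem 1.2.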
1.4 Ternary quadratic forms

We shall now prove an analogous result for positive-definite ternary quadratic 
forms. 


Lemma 1.3 Let
\[ A = \begin{pmatrix} a_{1,1} & a_{1,2} & a_{1,3} \\ a_{1,2} & a_{2,2} & a_{2,3} \\ a_{1,3} & a_{2,3} & a_{3,3} \end{pmatrix} \]
be a 3 × 3 symmetric matrix, and let $F_A$ be the corresponding ternary quadratic
form. Let d be the discriminant of $F_A$. Then

\[ a_{1,1} F_A(x_1, x_2, x_3) = (a_{1,1} x_1 + a_{1,2} x_2 + a_{1,3} x_3)^2 + G_{A^*}(x_2, x_3), \tag{1.3} \]

where $G_{A^*}$ is the binary quadratic form corresponding to the matrix

\[ A^* = \begin{pmatrix} a_{1,1} a_{2,2} - a_{1,2}^2 & a_{1,1} a_{2,3} - a_{1,2} a_{1,3} \\ a_{1,1} a_{2,3} - a_{1,2} a_{1,3} & a_{1,1} a_{3,3} - a_{1,3}^2 \end{pmatrix} \tag{1.4} \]

and $G_{A^*}$ has discriminant $a_{1,1} d$. If $F_A$ is positive-definite, then $G_{A^*}$ is positive-
definite. Moreover, the form $F_A$ is positive-definite if and only if the following three
determinants are positive:

\[ a_{1,1} = \det(a_{1,1}) \ge 1, \]
\[ d' = \det \begin{pmatrix} a_{1,1} & a_{1,2} \\ a_{1,2} & a_{2,2} \end{pmatrix} = a_{1,1} a_{2,2} - a_{1,2}^2 \ge 1, \]
and
\[ d = \det(A) \ge 1. \]

Proof. We obtain identities (1.3) and (1.4) as well as the discriminant of $G_{A^*}$
by straightforward calculation.
If $F_A$ is positive-definite, then

\[ F_A(1, 0, 0) = a_{1,1} \ge 1. \]

If $G_{A^*}(x_2, x_3) \le 0$ for some integers $x_2, x_3$, then $G_{A^*}(a_{1,1} x_2, a_{1,1} x_3) =
a_{1,1}^2 G_{A^*}(x_2, x_3) \le 0$. Let $x_1 = -(a_{1,2} x_2 + a_{1,3} x_3)$. Then
\[ a_{1,1} x_1 + a_{1,2} a_{1,1} x_2 + a_{1,3} a_{1,1} x_3 = 0, \]
and so

\[
\begin{aligned}
a_{1,1} F_A(x_1, a_{1,1} x_2, a_{1,1} x_3)
  &= (a_{1,1} x_1 + a_{1,2} a_{1,1} x_2 + a_{1,3} a_{1,1} x_3)^2 + G_{A^*}(a_{1,1} x_2, a_{1,1} x_3) \\
  &= G_{A^*}(a_{1,1} x_2, a_{1,1} x_3) \\
  &= a_{1,1}^2 G_{A^*}(x_2, x_3) \\
  &\le 0.
\end{aligned}
\]
Since $F_A$ is positive-definite, it follows that $x_2 = x_3 = 0$, and so the binary form
$G_{A^*}$ is also positive-definite. By Lemma 1.1, the leading coefficient of $G_{A^*}$ is
positive, that is,

\[ d' = a_{1,1} a_{2,2} - a_{1,2}^2 \ge 1, \]

and also the discriminant $a_{1,1} d$ of $G_{A^*}$ is positive, hence
\[ d = \det(A) \ge 1. \]

This proves that if $F_A$ is positive-definite, then the integers $a_{1,1}$, d', and d are
positive.

Conversely, if these three numbers are positive, then Lemma 1.1 implies that
the binary form $G_{A^*}$ is positive-definite, and so identity (1.3) gives
$F_A(x_1, x_2, x_3) \ge 0$. If $F_A(x_1, x_2, x_3) = 0$, then it follows from
identity (1.3) that

\[ G_{A^*}(x_2, x_3) = 0 \]

and
\[ a_{1,1} x_1 + a_{1,2} x_2 + a_{1,3} x_3 = 0. \]

The first equation implies that $x_2 = x_3 = 0$, and the second equation implies that
$x_1 = 0$. Therefore, the form $F_A$ is positive-definite.
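
Identity (1.3), the matrix (1.4), and the discriminant relation are described as a straightforward calculation; a numerical spot check on one example is sketched below. The matrix A and the helper names (`star`, `det3`, `is_positive_definite_ternary`) are illustrative choices, not taken from the text, and a check on finitely many points is of course not a proof.

```python
from itertools import product


def form(A, x):
    """F_A(x) = x^T A x for a list of integers x."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def det3(A):
    (a, b, c), (d, e, f), (g, h, i) = A
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def star(A):
    """The 2 x 2 matrix A* of (1.4) attached to a symmetric 3 x 3 matrix A."""
    a11, a12, a13 = A[0]
    a22, a23, a33 = A[1][1], A[1][2], A[2][2]
    return [
        [a11 * a22 - a12 * a12, a11 * a23 - a12 * a13],
        [a11 * a23 - a12 * a13, a11 * a33 - a13 * a13],
    ]


def is_positive_definite_ternary(A):
    """Criterion of Lemma 1.3: a11 >= 1, d' >= 1, and det(A) >= 1."""
    d_prime = A[0][0] * A[1][1] - A[0][1] ** 2
    return A[0][0] >= 1 and d_prime >= 1 and det3(A) >= 1


if __name__ == "__main__":
    A = [[2, 1, 0], [1, 2, 1], [0, 1, 2]]
    S = star(A)
    a11, a12, a13 = A[0]
    identity_holds = all(
        a11 * form(A, [x1, x2, x3])
        == (a11 * x1 + a12 * x2 + a13 * x3) ** 2 + form(S, [x2, x3])
        for x1, x2, x3 in product(range(-3, 4), repeat=3)
    )
    print("A* =", S)
    print("identity (1.3) holds on the test grid:", identity_holds)
    print("det A* == a11 * det A:", S[0][0] * S[1][1] - S[0][1] ** 2 == a11 * det3(A))
    print("positive-definite:", is_positive_definite_ternary(A))
```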


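Lemma 1.4 Let $B = (b_{i,j})$ be a 3 × 3 symmetric matrix such that the ternary
quadratic form $F_B$ is positive-definite. Let $G_{B^*}$ be the unique positive-definite
binary quadratic form such that

\[ b_{1,1} F_B(y_1, y_2, y_3) = (b_{1,1} y_1 + b_{1,2} y_2 + b_{1,3} y_3)^2 + G_{B^*}(y_2, y_3). \]
For any matrix $V^* = (v^*_{i,j}) \in SL_2(\mathbf{Z})$, let
\[ A^* = (V^*)^T B^* V^* \tag{1.5} \]

and let $G_{A^*}$ be the positive-definite binary quadratic form corresponding to the
symmetric matrix $A^*$ and equivalent to the form $G_{B^*}$. For any integers r and s, let

\[ V_{r,s} = (v_{i,j}) = \begin{pmatrix} 1 & r & s \\ 0 & v^*_{1,1} & v^*_{1,2} \\ 0 & v^*_{2,1} & v^*_{2,2} \end{pmatrix} \in SL_3(\mathbf{Z}) \tag{1.6} \]

and
\[ A_{r,s} = V_{r,s}^T B V_{r,s} = (a_{i,j}). \tag{1.7} \]

Let $F_{A_{r,s}}$ be the corresponding ternary quadratic form. Then $a_{1,1} = b_{1,1}$ and
\[ a_{1,1} F_{A_{r,s}}(x_1, x_2, x_3) = (a_{1,1} x_1 + a_{1,2} x_2 + a_{1,3} x_3)^2 + G_{A^*}(x_2, x_3), \]

where the matrix $A^*$ defined by (1.5) is independent of r and s.
Proof. Since $v_{1,1} = 1$ and $v_{2,1} = v_{3,1} = 0$, it follows from the matrix equa-
tion (1.7) that

\[ a_{1,j} = \sum_{k=1}^{3} \sum_{i=1}^{3} v^T_{1,k} b_{k,i} v_{i,j} = \sum_{k=1}^{3} \sum_{i=1}^{3} v_{k,1} b_{k,i} v_{i,j} = \sum_{i=1}^{3} b_{1,i} v_{i,j} \]

and so $a_{1,1} = b_{1,1}$. Let

\[ x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \quad\text{and}\quad V_{r,s} x = y = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}, \]
so
\[ y_i = \sum_{j=1}^{3} v_{i,j} x_j. \]
In particular,
\[
\begin{aligned}
y_2 &= v_{2,1} x_1 + v_{2,2} x_2 + v_{2,3} x_3 = v^*_{1,1} x_2 + v^*_{1,2} x_3 \\
y_3 &= v_{3,1} x_1 + v_{3,2} x_2 + v_{3,3} x_3 = v^*_{2,1} x_2 + v^*_{2,2} x_3.
\end{aligned}
\]
Let
\[ y^* = \begin{pmatrix} y_2 \\ y_3 \end{pmatrix} \quad\text{and}\quad x^* = \begin{pmatrix} x_2 \\ x_3 \end{pmatrix}. \]
Then
\[ V^* x^* = y^*. \]
It follows that
\[ G_{B^*}(y_2, y_3) = G_{B^*}(V^* x^*) = G_{A^*}(x_2, x_3). \]
Moreover,
\[
\begin{aligned}
b_{1,1} y_1 + b_{1,2} y_2 + b_{1,3} y_3 &= \sum_{i=1}^{3} b_{1,i} \sum_{j=1}^{3} v_{i,j} x_j \\
  &= \sum_{j=1}^{3} \left( \sum_{i=1}^{3} b_{1,i} v_{i,j} \right) x_j \\
  &= a_{1,1} x_1 + a_{1,2} x_2 + a_{1,3} x_3.
\end{aligned}
\]
Since

\[ F_{A_{r,s}}(x_1, x_2, x_3) = x^T A_{r,s} x = (V_{r,s} x)^T B (V_{r,s} x) = y^T B y = F_B(y_1, y_2, y_3), \]
it follows that
\[
\begin{aligned}
(a_{1,1} x_1 + a_{1,2} x_2 + a_{1,3} x_3)^2 + G_{A^*_{r,s}}(x_2, x_3)
  &= a_{1,1} F_{A_{r,s}}(x_1, x_2, x_3) \\
  &= b_{1,1} F_{A_{r,s}}(x_1, x_2, x_3) \\
  &= b_{1,1} F_B(y_1, y_2, y_3) \\
  &= (b_{1,1} y_1 + b_{1,2} y_2 + b_{1,3} y_3)^2 + G_{B^*}(y_2, y_3) \\
  &= (a_{1,1} x_1 + a_{1,2} x_2 + a_{1,3} x_3)^2 + G_{A^*}(x_2, x_3),
\end{aligned}
\]
and so
\[ G_{A^*_{r,s}}(x_2, x_3) = G_{A^*}(x_2, x_3) \]

for all integers r and s. This completes the proof.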


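Lemma 1.5 Let $u_{1,1}$, $u_{2,1}$, and $u_{3,1}$ be integers such that

\[ (u_{1,1}, u_{2,1}, u_{3,1}) = 1. \]

Then there exist six integers $u_{i,j}$ for i = 1, 2, 3 and j = 2, 3 such that the matrix
$U = (u_{i,j}) \in SL_3(\mathbf{Z})$, that is, det(U) = 1.


Proof. Let $(u_{1,1}, u_{2,1}) = a$. Choose integers $u_{1,2}$ and $u_{2,2}$ such that
\[ u_{1,1} u_{2,2} - u_{2,1} u_{1,2} = a. \]
Since $(a, u_{3,1}) = (u_{1,1}, u_{2,1}, u_{3,1}) = 1$, we can choose integers $u_{3,3}$ and b such that
\[ a u_{3,3} - b u_{3,1} = 1. \]

Let

\[ u_{1,3} = \frac{u_{1,1} b}{a}, \qquad u_{2,3} = \frac{u_{2,1} b}{a}, \qquad u_{3,2} = 0. \]

Then the matrix

\[ U = (u_{i,j}) = \begin{pmatrix} u_{1,1} & u_{1,2} & \left(\frac{u_{1,1}}{a}\right) b \\ u_{2,1} & u_{2,2} & \left(\frac{u_{2,1}}{a}\right) b \\ u_{3,1} & 0 & u_{3,3} \end{pmatrix} \]
has integer coefficients and determinant 1. This completes the proof.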
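
The construction in the proof of Lemma 1.5 can be carried out with the extended Euclidean algorithm. The sketch below follows the proof step by step; the function names are hypothetical. One case is handled separately as an assumption of this sketch rather than something stated in the proof: when $u_{1,1} = u_{2,1} = 0$ the gcd a is 0 and the division by a is undefined, so the code returns a fixed matrix of determinant 1 with first column $(0, 0, \pm 1)$.

```python
def ext_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) >= 0 and s*a + t*b == g."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    if old_r < 0:
        old_r, old_s, old_t = -old_r, -old_s, -old_t
    return old_r, old_s, old_t


def det3(M):
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def complete_to_sl3(u11, u21, u31):
    """A 3 x 3 integer matrix of determinant 1 with first column (u11, u21, u31)."""
    a, s, t = ext_gcd(u11, u21)          # s*u11 + t*u21 == a
    if a == 0:
        # Case outside the proof as written: here u31 must be +1 or -1.
        if abs(u31) != 1:
            raise ValueError("the three integers are not relatively prime")
        return [[0, u31, 0], [0, 0, 1], [u31, 0, 0]]
    u22, u12 = s, -t                     # u11*u22 - u21*u12 == a
    g, x, y = ext_gcd(a, u31)            # x*a + y*u31 == g
    if g != 1:
        raise ValueError("the three integers are not relatively prime")
    u33, b = x, -y                       # a*u33 - b*u31 == 1
    return [
        [u11, u12, (u11 // a) * b],
        [u21, u22, (u21 // a) * b],
        [u31, 0, u33],
    ]


if __name__ == "__main__":
    U = complete_to_sl3(2, 4, 3)
    print(U, "det =", det3(U))
```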
Lemma 1.6 Every equivalence class of positive-definite ternary quadratic forms
of discriminant d contains at least one form $\sum_{i,j=1}^{3} a_{i,j} x_i x_j$ for which

\[ 2 \max(|a_{1,2}|, |a_{1,3}|) \le a_{1,1} \le \frac{4}{3} \sqrt[3]{d}. \]

Proof. Let F be a positive-definite ternary quadratic form of determinant d, and
let C be the corresponding 3 × 3 symmetric matrix. Let $a_{1,1}$ be the smallest positive
integer represented by F. Then there exist integers $u_{1,1}$, $u_{2,1}$, and $u_{3,1}$ such that

\[ F(u_{1,1}, u_{2,1}, u_{3,1}) = a_{1,1}. \]

If $(u_{1,1}, u_{2,1}, u_{3,1}) = h$, then the form F also represents $a_{1,1}/h^2$, and so, by the
minimality of $a_{1,1}$, we have $(u_{1,1}, u_{2,1}, u_{3,1}) = 1$. By Lemma 1.5, there exist integers
$u_{i,j}$ for i = 1, 2, 3 and j = 2, 3 such that the matrix $U = (u_{i,j}) \in SL_3(\mathbf{Z})$. Let

\[ B = U^T C U = (b_{i,j}). \]
Then F is equivalent to the form $F_B$, and
\[ b_{1,1} = a_{1,1} \]
is also the smallest positive integer represented by $F_B$. By Lemma 1.3,
\[ b_{1,1} F_B(x_1, x_2, x_3) = (b_{1,1} x_1 + b_{1,2} x_2 + b_{1,3} x_3)^2 + G_{B^*}(x_2, x_3), \]

where $G_{B^*}(x_2, x_3)$ is a positive-definite binary quadratic form of determinant $a_{1,1} d$.
By Lemma 1.2, the form $G_{B^*}(x_2, x_3)$ is equivalent to a binary form
\[ G_{A^*}(x_2, x_3) = a^*_{1,1} x_2^2 + 2a^*_{1,2} x_2 x_3 + a^*_{2,2} x_3^2 \]

such that

\[ a^*_{1,1} \le \frac{2}{\sqrt{3}} \sqrt{a_{1,1} d}. \]

Choose $V^* \in SL_2(\mathbf{Z})$ such that $A^* = (V^*)^T B^* V^*$. Let $r, s \in \mathbf{Z}$, and let $V_{r,s} \in
SL_3(\mathbf{Z})$ be the matrix defined by (1.6) in Lemma 1.4. Let
\[ A = V_{r,s}^T B V_{r,s} = (a_{i,j}). \tag{1.8} \]

Note that the integer in the upper left corner of the matrix is still $a_{1,1}$, the smallest
positive integer represented by any form in the equivalence class of F, and that,
by Lemma 1.3,
\[ a^*_{1,1} = a_{1,1} a_{2,2} - a_{1,2}^2. \]

Finally, it follows from (1.8) that
\[ a_{1,2} = a_{1,1} r + b_{1,2} v^*_{1,1} + b_{1,3} v^*_{2,1} \]

and
\[ a_{1,3} = a_{1,1} s + b_{1,2} v^*_{1,2} + b_{1,3} v^*_{2,2}. \]

Therefore, we can choose r such that

\[ |a_{1,2}| \le \frac{a_{1,1}}{2} \]
and choose s such that
\[ |a_{1,3}| \le \frac{a_{1,1}}{2}. \]
Since
\[ a_{1,1} \le F_A(0, 1, 0) = a_{2,2}, \]
we have

\[
\begin{aligned}
a_{1,1}^2 &\le a_{1,1} a_{2,2} \\
          &= a_{1,1} a_{2,2} - a_{1,2}^2 + a_{1,2}^2 \\
          &= a^*_{1,1} + a_{1,2}^2 \\
          &\le \frac{2}{\sqrt{3}} \sqrt{a_{1,1} d} + \frac{a_{1,1}^2}{4}.
\end{aligned}
\]
This implies that
\[ a_{1,1}^2 \le \left(\frac{4}{3}\right)^{3/2} \sqrt{a_{1,1} d} \]
or, equivalently,
\[ a_{1,1} \le \frac{4}{3} \sqrt[3]{d}. \]
This completes the proof.


Theorem 1.3 Every positive-definite ternary quadratic form of discriminant 1 is
equivalent to the form $x_1^2 + x_2^2 + x_3^2$.


Proof. Let F be a positive-definite ternary quadratic form of discriminant 1. By
Lemma 1.6, the form F is equivalent to a form $F_A = \sum_{i,j=1}^{3} a_{i,j} x_i x_j$ for which

\[ 0 \le 2 \max(|a_{1,2}|, |a_{1,3}|) \le a_{1,1} \le \frac{4}{3}. \]

This implies that $a_{1,2} = a_{1,3} = 0$. Since d ≠ 0, it follows that $a_{1,1} \ne 0$ and so
$a_{1,1} = 1$. Therefore,
\[ A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & a_{2,2} & a_{2,3} \\ 0 & a_{2,3} & a_{3,3} \end{pmatrix}, \]
where the 2 × 2 matrix
\[ A^* = \begin{pmatrix} a_{2,2} & a_{2,3} \\ a_{2,3} & a_{3,3} \end{pmatrix} \]
has determinant 1. By Theorem 1.2, there exists a matrix

\[ U^* = \begin{pmatrix} u_{2,2} & u_{2,3} \\ u_{3,2} & u_{3,3} \end{pmatrix} \in SL_2(\mathbf{Z}) \]

such that $(U^*)^T A^* U^*$ is the 2 × 2 identity matrix $I_2$. Let

\[ U = \begin{pmatrix} 1 & 0 & 0 \\ 0 & u_{2,2} & u_{2,3} \\ 0 & u_{3,2} & u_{3,3} \end{pmatrix}. \]

Then $U^T A U$ is the 3 × 3 identity matrix $I_3$. This completes the proof.


1.5 Sums of three squares 


In this section, we determine the integers that can be written as the sum of three
squares. The proof uses the fact that a number is the sum of three squares if
and only if it can be represented by some positive-definite ternary quadratic form
of discriminant 1, together with two important theorems of elementary number
theory: Gauss’s law of quadratic reciprocity and Dirichlet’s theorem on primes in
arithmetic progressions.

The statement that a is a quadratic residue modulo m means that there exist
integers x and y such that $x^2 - a = ym$. If p is prime and (a, p) = 1, then the
Legendre symbol $\left(\frac{a}{p}\right)$ is defined by $\left(\frac{a}{p}\right) = 1$ if a is a quadratic residue modulo p
and $\left(\frac{a}{p}\right) = -1$ if a is not a quadratic residue modulo p. By quadratic reciprocity,
if p and q are distinct odd primes, then $\left(\frac{p}{q}\right) = \left(\frac{q}{p}\right)$ if $p \equiv 1 \pmod{4}$ or $q \equiv 1
\pmod{4}$, and $\left(\frac{p}{q}\right) = -\left(\frac{q}{p}\right)$ if $p \equiv q \equiv 3 \pmod{4}$. Also, $\left(\frac{-1}{p}\right) = 1$ if and only
if $p \equiv 1 \pmod{4}$, and $\left(\frac{2}{p}\right) = 1$ if and only if $p \equiv 1$ or $7 \pmod{8}$.
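
The Legendre symbol and the reciprocity statements above can be tabulated for small primes. The sketch below computes the symbol directly from the definition by searching for a square root modulo p, which is slow but stays close to the text; the function names and the range of primes tested are arbitrary choices for illustration.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p not dividing a, by direct search."""
    if a % p == 0:
        raise ValueError("a must not be divisible by p")
    return 1 if any((x * x - a) % p == 0 for x in range(1, p)) else -1


def odd_primes_below(n):
    return [p for p in range(3, n) if all(p % d for d in range(2, int(p ** 0.5) + 1))]


def reciprocity_holds(p, q):
    """Check the law of quadratic reciprocity for distinct odd primes p and q."""
    if p % 4 == 1 or q % 4 == 1:
        return legendre(p, q) == legendre(q, p)
    return legendre(p, q) == -legendre(q, p)


if __name__ == "__main__":
    primes = odd_primes_below(60)
    print(all(reciprocity_holds(p, q) for p in primes for q in primes if p != q))
    print(all((legendre(-1, p) == 1) == (p % 4 == 1) for p in primes))
    print(all((legendre(2, p) == 1) == (p % 8 in (1, 7)) for p in primes))
```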

Lemma 1.7 Let n ≥ 2. If there exists a positive integer d' such that -d' is a
quadratic residue modulo d'n - 1, then n can be represented as the sum of three
squares.


Proof. If -d' is a quadratic residue modulo d'n - 1, then there exist integers
$a_{1,2}$ and $a_{1,1}$ such that

\[ a_{1,2}^2 + d' = a_{1,1}(d'n - 1) = a_{1,1} a_{2,2}, \]

where
\[ a_{2,2} = d'n - 1 \ge 2d' - 1 \ge 1 \]
and so
\[ a_{1,1} \ge 1. \]
Equivalently,

\[ d' = a_{1,1} a_{2,2} - a_{1,2}^2. \]

The symmetric matrix
\[ A = \begin{pmatrix} a_{1,1} & a_{1,2} & 1 \\ a_{1,2} & a_{2,2} & 0 \\ 1 & 0 & n \end{pmatrix} \]
has determinant
\[ \det(A) = (a_{1,1} a_{2,2} - a_{1,2}^2) n - a_{2,2} = d'n - a_{2,2} = 1. \]

By Lemma 1.3, the quadratic form $F_A$ corresponding to the matrix A is positive-definite.
Moreover, $F_A$ has discriminant 1 and represents n, since $F_A(0, 0, 1) = n$. By
Theorem 1.3, the form $x_1^2 + x_2^2 + x_3^2$ must also represent n. This completes the
proof.
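
The matrix built in the proof of Lemma 1.7 is easy to produce for a specific n. The sketch below searches for a suitable d' by brute force (the search bound `max_d` is an arbitrary assumption of this sketch, and the function names are hypothetical), builds the matrix A, and checks that it has determinant 1 and satisfies the positivity criterion of Lemma 1.3.

```python
def det3(M):
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def lemma_1_7_matrix(n, max_d=50):
    """Return (d', A) as in the proof of Lemma 1.7, or None if no d' <= max_d works.

    Requires n >= 2. The search is brute force over d' and over residues a12.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    for d in range(1, max_d + 1):
        a22 = d * n - 1
        for a12 in range(a22):
            if (a12 * a12 + d) % a22 == 0:   # -d' is a square modulo d'n - 1
                a11 = (a12 * a12 + d) // a22
                return d, [[a11, a12, 1], [a12, a22, 0], [1, 0, n]]
    return None


if __name__ == "__main__":
    d, A = lemma_1_7_matrix(6)
    print("d' =", d, "A =", A, "det =", det3(A))
    minors = (A[0][0], A[0][0] * A[1][1] - A[0][1] ** 2, det3(A))
    print("leading determinants a11, d', d:", minors)
    print("search for n = 7:", lemma_1_7_matrix(7))
```

For n = 7 the search cannot succeed for any bound, since by the lemma a successful d' would make 7 a sum of three squares.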


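Lemma 1.8 If n is a positive integer and $n \equiv 2 \pmod{4}$, then n can be
represented as the sum of three squares.
Proof. Since (4n, n - 1) = 1, it follows from Dirichlet’s theorem that the arith-
metic progression {4nj + n - 1 : j = 1, 2, ...} contains infinitely many primes.
Choose j ≥ 1 such that

\[ p = 4nj + n - 1 = (4j + 1)n - 1 \]
is prime. Let d' = 4j + 1. Since $n \equiv 2 \pmod{4}$, we have
\[ p = d'n - 1 \equiv 1 \pmod{4}. \]
By Lemma 1.7, it suffices to prove that -d' is a quadratic residue modulo p. Let
\[ d' = \prod_{q_i \mid d'} q_i^{k_i}, \]
where the $q_i$ are the distinct primes dividing d'. Then
\[ p = d'n - 1 \equiv -1 \pmod{q_i} \]

for all i, and

\[ d' \equiv \prod_{\substack{q_i \mid d' \\ q_i \equiv 3 \pmod{4}}} (-1)^{k_i} \equiv 1 \pmod{4}. \]

Therefore,

\[ \prod_{\substack{q_i \mid d' \\ q_i \equiv 3 \pmod{4}}} (-1)^{k_i} = 1. \]

By quadratic reciprocity we have

\[ \left(\frac{q_i}{p}\right) = \left(\frac{p}{q_i}\right) \]

since $p \equiv 1 \pmod{4}$, and

\[
\begin{aligned}
\left(\frac{-d'}{p}\right) &= \left(\frac{-1}{p}\right) \left(\frac{d'}{p}\right) = \left(\frac{d'}{p}\right) \\
  &= \prod_{q_i \mid d'} \left(\frac{q_i}{p}\right)^{k_i} \\
  &= \prod_{q_i \mid d'} \left(\frac{p}{q_i}\right)^{k_i} \\
  &= \prod_{q_i \mid d'} \left(\frac{-1}{q_i}\right)^{k_i} \\
  &= \prod_{\substack{q_i \mid d' \\ q_i \equiv 3 \pmod{4}}} (-1)^{k_i} \\
  &= 1.
\end{aligned}
\]

This completes the proof.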
Lemma 1.9 If n is a positive integer such that $n \equiv 1, 3,$ or $5 \pmod{8}$, then n
can be represented as the sum of three squares.


Proof. Clearly, 1 is a sum of three nonnegative squares. Let n ≥ 2. Let

\[ c = \begin{cases} 3 & \text{if } n \equiv 1 \pmod{8} \\ 1 & \text{if } n \equiv 3 \pmod{8} \\ 3 & \text{if } n \equiv 5 \pmod{8}. \end{cases} \]

If $n \equiv 1$ or $3 \pmod{8}$, then

\[ \frac{cn - 1}{2} \equiv 1 \pmod{4}. \]

If $n \equiv 5 \pmod{8}$, then

\[ \frac{cn - 1}{2} \equiv 3 \pmod{4}. \]

In all three cases,

\[ \left( 4n, \frac{cn - 1}{2} \right) = 1. \]

By Dirichlet’s theorem, there exists a prime number p of the form

\[ p = 4nj + \frac{cn - 1}{2} \]
for some positive integer j. Let
\[ d' = 8j + c. \]

Then
\[ 2p = (8j + c)n - 1 = d'n - 1. \]

By Lemma 1.7, it suffices to prove that -d' is a quadratic residue modulo 2p.
If -d' is a quadratic residue modulo p, then there exists an integer $x_0$ such that

\[ (x_0 + p)^2 + d' \equiv x_0^2 + d' \equiv 0 \pmod{p}. \]

Let $x = x_0$ if $x_0$ is odd, and let $x = x_0 + p$ if $x_0$ is even. Then x is odd and $x^2 + d'$
is even. Since
\[ x^2 + d' \equiv 0 \pmod{2} \]

and
\[ x^2 + d' \equiv 0 \pmod{p}, \]

it follows that
\[ x^2 + d' \equiv 0 \pmod{2p}. \]

Therefore, it suffices to prove that -d' is a quadratic residue modulo p.




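Let
\[ d' = \prod_{q_i \mid d'} q_i^{k_i} \]
be the factorization of the odd integer $d'$ into a product of powers of distinct odd
primes $q_i$. Since
\[ 2p \equiv -1 \pmod{d'}, \]
it follows that
\[ 2p \equiv -1 \pmod{q_i} \]
and
\[ (p, q_i) = 1 \]
for every prime $q_i$ that divides $d'$.
If $n \equiv 1$ or $3 \pmod 8$, then $p \equiv 1 \pmod 4$ and
\[ \left(\frac{-d'}{p}\right) = \left(\frac{-1}{p}\right)\left(\frac{d'}{p}\right) = \left(\frac{d'}{p}\right)
 = \prod_{q_i \mid d'} \left(\frac{q_i}{p}\right)^{k_i}
 = \prod_{q_i \mid d'} \left(\frac{p}{q_i}\right)^{k_i}. \]
If $n \equiv 5 \pmod 8$, then $p \equiv 3 \pmod 4$ and $d' \equiv 3 \pmod 8$. From the
factorization of $d'$, we obtain
\[ d' = \prod_{\substack{q_i \mid d' \\ q_i \equiv 1 \pmod 4}} q_i^{k_i} \prod_{\substack{q_i \mid d' \\ q_i \equiv 3 \pmod 4}} q_i^{k_i}
 \equiv \prod_{\substack{q_i \mid d' \\ q_i \equiv 3 \pmod 4}} (-1)^{k_i} \equiv -1 \pmod 4 \]
and so
\[ \prod_{\substack{q_i \mid d' \\ q_i \equiv 3 \pmod 4}} (-1)^{k_i} = -1. \]
It follows from quadratic reciprocity that
\begin{align*}
\left(\frac{-d'}{p}\right) &= \left(\frac{-1}{p}\right)\left(\frac{d'}{p}\right) = -\left(\frac{d'}{p}\right) \\
 &= -\prod_{\substack{q_i \mid d' \\ q_i \equiv 1 \pmod 4}} \left(\frac{q_i}{p}\right)^{k_i} \prod_{\substack{q_i \mid d' \\ q_i \equiv 3 \pmod 4}} \left(\frac{q_i}{p}\right)^{k_i} \\
 &= -\prod_{\substack{q_i \mid d' \\ q_i \equiv 1 \pmod 4}} \left(\frac{p}{q_i}\right)^{k_i} \prod_{\substack{q_i \mid d' \\ q_i \equiv 3 \pmod 4}} \left(\frac{p}{q_i}\right)^{k_i} \prod_{\substack{q_i \mid d' \\ q_i \equiv 3 \pmod 4}} (-1)^{k_i} \\
 &= \prod_{q_i \mid d'} \left(\frac{p}{q_i}\right)^{k_i}.
\end{align*}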


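In both cases,
\begin{align*}
\left(\frac{-d'}{p}\right) &= \prod_{q_i \mid d'} \left(\frac{p}{q_i}\right)^{k_i}
 = \prod_{q_i \mid d'} \left(\frac{2}{q_i}\right)^{k_i} \left(\frac{2p}{q_i}\right)^{k_i}
 = \prod_{q_i \mid d'} \left(\frac{2}{q_i}\right)^{k_i} \left(\frac{-1}{q_i}\right)^{k_i} \\
 &= \prod_{\substack{q_i \mid d' \\ q_i \equiv 3, 5 \pmod 8}} (-1)^{k_i} \prod_{\substack{q_i \mid d' \\ q_i \equiv 3, 7 \pmod 8}} (-1)^{k_i}
 = \prod_{\substack{q_i \mid d' \\ q_i \equiv 5, 7 \pmod 8}} (-1)^{k_i}.
\end{align*}
Therefore, $-d'$ is a quadratic residue modulo $2p = d'n - 1$ if
\[ \sum_{\substack{q_i \mid d' \\ q_i \equiv 5, 7 \pmod 8}} k_i \equiv 0 \pmod 2. \]
This is what we shall prove. We have
\begin{align*}
d' &= \prod_{\substack{q_i \mid d' \\ q_i \equiv 1 \pmod 8}} q_i^{k_i} \prod_{\substack{q_i \mid d' \\ q_i \equiv 3 \pmod 8}} q_i^{k_i} \prod_{\substack{q_i \mid d' \\ q_i \equiv 5 \pmod 8}} q_i^{k_i} \prod_{\substack{q_i \mid d' \\ q_i \equiv 7 \pmod 8}} q_i^{k_i} \\
 &\equiv \prod_{\substack{q_i \mid d' \\ q_i \equiv 3 \pmod 8}} 3^{k_i} \prod_{\substack{q_i \mid d' \\ q_i \equiv 5 \pmod 8}} (-3)^{k_i} \prod_{\substack{q_i \mid d' \\ q_i \equiv 7 \pmod 8}} (-1)^{k_i} \pmod 8 \\
 &\equiv \prod_{\substack{q_i \mid d' \\ q_i \equiv 3, 5 \pmod 8}} 3^{k_i} \prod_{\substack{q_i \mid d' \\ q_i \equiv 5, 7 \pmod 8}} (-1)^{k_i} \pmod 8.
\end{align*}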


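If $n \equiv 1$ or $5 \pmod 8$, then $c = 3$ and
\[ d' = 8j + 3 \equiv 3 \pmod 8. \]
This implies that
\[ \sum_{\substack{q_i \mid d' \\ q_i \equiv 3, 5 \pmod 8}} k_i \equiv 1 \pmod 2 \]
and
\[ \sum_{\substack{q_i \mid d' \\ q_i \equiv 5, 7 \pmod 8}} k_i \equiv 0 \pmod 2. \]
If $n \equiv 3 \pmod 8$, then $c = 1$ and
\[ d' = 8j + 1 \equiv 1 \pmod 8. \]
It follows that
\[ \sum_{\substack{q_i \mid d' \\ q_i \equiv 3, 5 \pmod 8}} k_i \equiv 0 \pmod 2 \]
and
\[ \sum_{\substack{q_i \mid d' \\ q_i \equiv 5, 7 \pmod 8}} k_i \equiv 0 \pmod 2. \]
This completes the proof.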


Theorem 1.4 (Gauss) A positive integer N can be represented as the sum of three 
squares if and only if N is not of the form 


\[ N = 4^a(8k + 7). \]

Proof. Since
\[ x^2 \equiv 0, 1, \text{ or } 4 \pmod 8 \]
for every integer $x$, it follows that a sum of three squares can never be congruent to
7 modulo 8. If the integer $4m$ is the sum of three squares, then there exist integers
$x_1, x_2, x_3$ such that
\[ 4m = x_1^2 + x_2^2 + x_3^2. \]
This is possible only if $x_1, x_2, x_3$ are all even, and so
\[ m = \left(\frac{x_1}{2}\right)^2 + \left(\frac{x_2}{2}\right)^2 + \left(\frac{x_3}{2}\right)^2. \]
Therefore, $4^a m$ is the sum of three squares if and only if $m$ is the sum of three
squares. This proves that no integer of the form $4^a(8k + 7)$ can be the sum of three
squares.

Every positive integer $N$ can be written uniquely in the form $N = 4^a m$, where
$m \equiv 2 \pmod 4$ or $m \equiv 1, 3, 5,$ or $7 \pmod 8$. By Lemma 1.8 and Lemma 1.9,
the positive integer $N$ is the sum of three squares unless $m \equiv 7 \pmod 8$. This
completes the proof.
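
Gauss's criterion is easy to test numerically. The following Python sketch (the function names are illustrative and not part of the text) compares a brute-force search for three squares with the condition $N = 4^a(8k+7)$ for small $N$.

```python
from math import isqrt

def is_sum_of_three_squares(n: int) -> bool:
    """Brute-force search for integers x >= y >= 0 and z >= 0 with x^2 + y^2 + z^2 == n."""
    for x in range(isqrt(n) + 1):
        for y in range(x + 1):
            rest = n - x * x - y * y
            if rest < 0:
                break  # larger y only makes rest smaller
            z = isqrt(rest)
            if z * z == rest:
                return True
    return False

def excluded_by_gauss(n: int) -> bool:
    """True when the positive integer n has the form 4^a (8k + 7)."""
    while n % 4 == 0:
        n //= 4
    return n % 8 == 7

# Theorem 1.4 predicts that the two tests always disagree, so this list is empty.
mismatches = [n for n in range(1, 2001)
              if is_sum_of_three_squares(n) == excluded_by_gauss(n)]
```

The check covers only $1 \le N \le 2000$; it illustrates the statement and is of course not a proof.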


Theorem 1.5 If $N$ is a positive integer such that $N \equiv 3 \pmod 8$, then $N$ is the
sum of three odd squares.

Proof. Recall that $x^2 \equiv 0, 1,$ or $4 \pmod 8$ for every integer $x$. If $N \equiv 3
\pmod 8$ is a sum of three squares, then each of the squares must be congruent to
1 modulo 8, and so each of the squares must be odd. This completes the proof.


1.6 Thin sets of squares 


If $A$ is a finite set of nonnegative integers such that every integer from 0 to $N$ can
be written as the sum of $h$ elements of $A$, with repetitions allowed, then $A$ is called
a basis of order $h$ for $N$. A simple counting argument shows that if $A$ is a basis of
order $h$ for $N$, then $A$ cannot be too small.


Theorem 1.6 Let $h \ge 2$. There exists a positive constant $c = c(h)$ such that, if $A$
is a basis of order $h$ for $N$, then
\[ |A| > cN^{1/h}. \]

Proof. Let $|A| = k$. If $A$ is a basis of order $h$ for $N$, then each of the integers
$0, 1, \ldots, N$ is a sum of $h$ elements of $A$, with repetitions allowed. The number of
combinations of $h$ elements, with repetitions allowed, of a set of cardinality $k$ is
the binomial coefficient $\binom{k+h-1}{h}$. Therefore,
\[ N + 1 \le \binom{k+h-1}{h} = \frac{k(k+1)\cdots(k+h-1)}{h!} \le \frac{c'k^h}{h!} \]
for some constant $c' > 0$ and all $k$, and so
\[ |A| = k > \left(\frac{h!\,N}{c'}\right)^{1/h} = cN^{1/h}. \]


This completes the proof. 
Since the squares form a basis of order 4, it follows that for every $N \ge 0$ the set
$Q_N$ of all squares up to $N$ is a basis of order 4 for $N$. Moreover,
\[ |Q_N| = 1 + [N^{1/2}] > N^{1/2}. \]
This is much larger than $cN^{1/4}$, which is a lower bound for the thinnest possible
basis of order 4. It is natural to ask if for every $N$ there exists a set $A_N$ of squares
that is a basis of order 4 for $N$ and satisfies
\[ \lim_{N \to \infty} \frac{|A_N|}{N^{1/2}} = 0. \]


The answer is provided by the following theorem. 


Theorem 1.7 (Choi–Erdős–Nathanson) For every $N \ge 2$, there exists a set $A_N$
of squares such that $A_N$ is a basis of order 4 for $N$ and
\[ |A_N| < \left(\frac{4}{\log 2}\right) N^{1/3} \log N. \]

Proof. The sets $A_2 = A_3 = \{0, 1\}$ and $A_4 = A_5 = \{0, 1, 4\}$ satisfy the
requirements of the theorem. Therefore, we can assume that $N \ge 6$.

We begin with a simple remark. By Theorem 1.4, if $\ell$ is a nonnegative integer
and $\ell \equiv 1$ or $2 \pmod 4$, then $\ell$ is the sum of three squares. Since the square of
an even integer is $0 \pmod 4$ and the square of an odd integer is $1 \pmod 4$, it
follows that if $m \not\equiv 0 \pmod 4$ and $a$ is any positive integer such that $a^2 \le m$,
then either $m - a^2$ is the sum of three squares or $m - (a - 1)^2$ is the sum of three
squares.

For $N \ge 6$, we let $A_N^{(1)}$ consist of the squares of all nonnegative integers up to
$2N^{1/3}$. Then
\[ |A_N^{(1)}| \le 2N^{1/3} + 1. \]
Let $A_N^{(2)}$ consist of the squares of all integers of the form
\[ [k^{1/2}N^{1/3}] \quad\text{or}\quad [k^{1/2}N^{1/3}] - 1, \]
where
\[ 4 \le k \le N^{1/3}. \]
Then
\[ |A_N^{(2)}| \le 2(N^{1/3} - 3) = 2N^{1/3} - 6. \]
Let
\[ A_N^{(0)} = A_N^{(1)} \cup A_N^{(2)}. \]
Then
\[ |A_N^{(0)}| < 4N^{1/3}. \]
Since $A_N^{(1)}$ contains all the squares up to $4N^{2/3}$, it follows from Lagrange's theorem
that every nonnegative integer up to $4N^{2/3}$ is the sum of four squares belonging to
$A_N^{(0)}$.
Let $m$ be an integer such that
\[ 4N^{2/3} < m \le N \]
and
\[ m \not\equiv 0 \pmod 4. \]
We shall prove that there exists an integer $a_0^2 \in A_N^{(2)}$ such that
\[ 0 \le m - a_0^2 \le 4N^{2/3} \]
and $m - a_0^2$ is the sum of three squares. Since
\[ 4 < \frac{m}{N^{2/3}} \le N^{1/3}, \]
it follows that
\[ 4 \le k = \left[\frac{m}{N^{2/3}}\right] \le N^{1/3}. \]
Let
\[ a = [k^{1/2}N^{1/3}]. \]
Then $a^2 \in A_N^{(2)}$, $(a-1)^2 \in A_N^{(2)}$,
\[ a^2 \le kN^{2/3} \le m < (k+1)N^{2/3}, \]
and
\[ a > k^{1/2}N^{1/3} - 1. \]
It follows from our initial remark that either $m - a^2$ or $m - (a - 1)^2$ is the sum
of three squares. Choose $a_0^2 \in \{(a - 1)^2, a^2\} \subseteq A_N^{(2)}$ such that $m - a_0^2$ is a sum of
three squares. Since $4 \le 3N^{1/6}$ for $N \ge 6$, we have
\begin{align*}
0 \le m - a^2 &\le m - a_0^2 \\
 &\le m - (a-1)^2 \\
 &< (k+1)N^{2/3} - (k^{1/2}N^{1/3} - 2)^2 \\
 &< (k+1)N^{2/3} - kN^{2/3} + 4k^{1/2}N^{1/3} \\
 &= N^{2/3} + 4k^{1/2}N^{1/3} \\
 &\le N^{2/3} + 4N^{1/2} \\
 &\le 4N^{2/3},
\end{align*}
and so $m - a_0^2$ is the sum of three squares belonging to $A_N^{(1)}$. Therefore, if $0 \le m \le N$
and $m \not\equiv 0 \pmod 4$, then $m$ is the sum of four squares belonging to $A_N^{(0)}$.
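Let
\[ A_N = \left\{ 4^i a : 0 \le i \le \frac{\log N}{\log 4} \text{ and } a \in A_N^{(0)} \right\}. \]
Then $A_N$ is a set of squares and
\[ |A_N| \le \left(\frac{\log N}{\log 4} + 1\right) |A_N^{(0)}| \le \left(\frac{2\log N}{\log 4}\right) 4N^{1/3} = \left(\frac{4}{\log 2}\right) N^{1/3} \log N. \]
Let $n \in [0, N]$. If $n \not\equiv 0 \pmod 4$, then $n$ is the sum of four squares belonging
to $A_N^{(0)} \subseteq A_N$. If $n \equiv 0 \pmod 4$, then $n = 4^i m$, where $m \not\equiv 0 \pmod 4$ and
$0 \le i \le \log N/\log 4$. Then
\[ m = a_1^2 + a_2^2 + a_3^2 + a_4^2, \]
where $a_1^2, a_2^2, a_3^2, a_4^2 \in A_N^{(0)}$, and so
\[ n = 4^i m = (2^i a_1)^2 + (2^i a_2)^2 + (2^i a_3)^2 + (2^i a_4)^2 \]
is a sum of four squares belonging to $A_N$. This completes the proof.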
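
The construction in this proof is explicit, so it can be carried out by machine. The Python sketch below builds $A_N$ as described (the sets $A_N^{(1)}$, $A_N^{(2)}$ and the dilations by $4^i$) and checks by brute force that it is a basis of order 4 for $N$. The function names are illustrative assumptions, and the floor functions are evaluated with exact integer roots rather than floating point.

```python
from math import log

def iroot(x: int, r: int) -> int:
    """Largest integer a >= 0 with a**r <= x (float guess, corrected exactly)."""
    a = int(round(x ** (1.0 / r)))
    while a ** r > x:
        a -= 1
    while (a + 1) ** r <= x:
        a += 1
    return a

def thin_square_basis(N: int) -> set[int]:
    """Build the set A_N from the proof of Theorem 1.7 (assumes N >= 6)."""
    # A^(1): squares of all integers a <= 2 N^(1/3), that is, a^3 <= 8N.
    base = {a * a for a in range(iroot(8 * N, 3) + 1)}
    # A^(2): squares of a = [k^(1/2) N^(1/3)] and of a - 1 for 4 <= k <= N^(1/3);
    # a <= k^(1/2) N^(1/3) is equivalent to a^6 <= k^3 N^2.
    for k in range(4, iroot(N, 3) + 1):
        a = iroot(k ** 3 * N ** 2, 6)
        base |= {a * a, (a - 1) * (a - 1)}
    # A_N: all 4^i * a with a in A^(0) and 4^i <= N.
    result, scale = set(), 1
    while scale <= N:
        result |= {scale * a for a in base}
        scale *= 4
    return result

def is_basis_of_order_4(A: set[int], N: int) -> bool:
    """Check that every n in [0, N] is a sum of four elements of A."""
    elems = [a for a in A if a <= N]
    two = {a + b for a in elems for b in elems if a + b <= N}
    return all(any((n - s) in two for s in two if s <= n) for n in range(N + 1))

def size_bound(N: int) -> float:
    """The upper bound (4 / log 2) N^(1/3) log N from Theorem 1.7."""
    return 4 / log(2) * N ** (1 / 3) * log(N)
```

For example, `is_basis_of_order_4(thin_square_basis(1000), 1000)` tests the case $N = 1000$, and `len(thin_square_basis(1000))` can be compared with `size_bound(1000)`.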
1.7 The polygonal number theorem


We begin by proving Gauss’s theorem that the triangles form a basis of order three. 
Equivalently, as Gauss wrote in his journal on July 10, 1796, 


ΕΥΡΗΚΑ! num = Δ + Δ + Δ.


Theorem 1.8 (Gauss) Every nonnegative integer is the sum of three triangles. 


Proof. The triangular numbers are integers of the form $k(k + 1)/2$. Let $N \ge 1$.
By Theorem 1.5, the integer $8N + 3$ is the sum of three odd squares, and so there
exist nonnegative integers $k_1, k_2, k_3$ such that
\begin{align*}
8N + 3 &= (2k_1 + 1)^2 + (2k_2 + 1)^2 + (2k_3 + 1)^2 \\
 &= 4(k_1^2 + k_1 + k_2^2 + k_2 + k_3^2 + k_3) + 3.
\end{align*}
Therefore,
\[ N = \frac{k_1(k_1 + 1)}{2} + \frac{k_2(k_2 + 1)}{2} + \frac{k_3(k_3 + 1)}{2}. \]

This completes the proof. 
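
The proof is constructive once a representation of $8N + 3$ as a sum of three odd squares is known. As a minimal sketch (the function name is an assumption made for illustration), the following Python code finds such a representation by brute force and converts it into three triangular numbers.

```python
from math import isqrt

def three_triangles(N: int) -> tuple[int, int, int]:
    """Return (k1, k2, k3) with N = T(k1) + T(k2) + T(k3), where T(k) = k(k+1)/2."""
    target = 8 * N + 3
    for x in range(1, isqrt(target) + 1, 2):      # odd x
        for y in range(1, x + 1, 2):              # odd y <= x
            rest = target - x * x - y * y
            if rest < 1:
                break
            z = isqrt(rest)
            # rest is congruent to 1 modulo 8, so a square root z is automatically odd.
            if z * z == rest:
                return ((x - 1) // 2, (y - 1) // 2, (z - 1) // 2)
    raise ValueError("no representation found")

def triangle(k: int) -> int:
    """The k-th triangular number."""
    return k * (k + 1) // 2
```

Each odd square $(2k+1)^2$ in the representation of $8N+3$ contributes the triangular number $k(k+1)/2$ to $N$, exactly as in the displayed identity.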

Lagrange's theorem (Theorem 1.1) is the polygonal number theorem for squares,
and Gauss's theorem is the polygonal number theorem for triangles. We shall now
prove the theorem for polygonal numbers of order $m + 2$ for all $m \ge 3$. It is easy
to check the polygonal number theorem for small values of $N/m$. Recall that the
$k$th polygonal number of order $m + 2$ is
\[ p_m(k) = \frac{mk(k - 1)}{2} + k. \]
The first six polygonal numbers are
\begin{align*}
p_m(0) &= 0 \\
p_m(1) &= 1 \\
p_m(2) &= m + 2 \\
p_m(3) &= 3m + 3 \\
p_m(4) &= 6m + 4 \\
p_m(5) &= 10m + 5.
\end{align*}


If $k_1, \ldots, k_s$ are positive integers, then, for $r = 0, 1, \ldots, m + 2 - s$, the numbers
of the form
\[ p_m(k_1) + p_m(k_2) + \cdots + p_m(k_s) + r p_m(1) \tag{1.9} \]
are an interval of $m + 3 - s$ consecutive integers, each of which is a sum of exactly
$m + 2$ polygonal numbers. Here is a short table of representations of integers as
sums of $m + 2$ polygonal numbers of order $m + 2$. The first column expresses the
integer as a sum of polygonal numbers in the form (1.9), and the next two columns
give the smallest and largest integers that the expression represents.

Expression                          Smallest    Largest
r p_m(1)                            0           m + 2
p_m(2) + r p_m(1)                   m + 2       2m + 3
2p_m(2) + r p_m(1)                  2m + 4      3m + 4
p_m(3) + r p_m(1)                   3m + 3      4m + 4
p_m(3) + p_m(2) + r p_m(1)          4m + 5      5m + 5
4p_m(2) + r p_m(1)                  4m + 8      5m + 6
p_m(3) + 2p_m(2) + r p_m(1)         5m + 7      6m + 6
p_m(4) + r p_m(1)                   6m + 4      7m + 5
p_m(4) + p_m(2) + r p_m(1)          7m + 6      8m + 6
2p_m(3) + p_m(2) + r p_m(1)         7m + 8      8m + 7
p_m(4) + 2p_m(2) + r p_m(1)         8m + 8      9m + 7
p_m(4) + p_m(3) + r p_m(1)          9m + 7      10m + 7
p_m(5) + r p_m(1)                   10m + 5     11m + 6
p_m(5) + p_m(2) + r p_m(1)          11m + 7     12m + 7


This table gives explicit polygonal number representations for all integers up to
$12m + 7$. It is not difficult to extend this computation. Pepin [95] and Dickson [23]
published tables of representations of $N$ as a sum of $m + 2$ polygonal numbers
of order $m + 2$ for all $m \ge 3$ and $N < 120m$. Therefore, it suffices to prove the
polygonal number theorem for $N \ge 120m$.
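
The small cases covered by the table can also be checked mechanically. The following Python sketch (names are illustrative) computes, by dynamic programming, the least number of positive polygonal numbers of order $m + 2$ needed to represent each integer up to a given limit; padding with $p_m(0) = 0$ then gives a sum of exactly $m + 2$ terms whenever this least number is at most $m + 2$.

```python
def p(m: int, k: int) -> int:
    """k-th polygonal number of order m + 2, namely m*k*(k-1)/2 + k."""
    return m * k * (k - 1) // 2 + k

def min_polygonal_terms(m: int, limit: int) -> list[int]:
    """cost[n] = least number of positive polygonal numbers of order m + 2 with sum n."""
    polys = []
    k = 1
    while p(m, k) <= limit:
        polys.append(p(m, k))
        k += 1
    cost = [0] * (limit + 1)
    for n in range(1, limit + 1):
        # p(m, 1) = 1 is always available, so the minimum is over a nonempty set.
        cost[n] = 1 + min(cost[n - q] for q in polys if q <= n)
    return cost
```

For instance, `max(min_polygonal_terms(3, 12 * 3 + 7))` gives the largest number of pentagonal numbers needed for any integer up to $12m + 7 = 43$.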

We need the following lemmas. 


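Lemma 1.10 Let $m \ge 3$ and $N \ge 2m$. Let $L$ denote the length of the interval
\[ I = \left( \frac{1}{2} + \sqrt{\frac{6N}{m} - 3},\ \frac{2}{3} + \sqrt{\frac{8N}{m} - 8} \right). \]
Then
\[ L > 4 \quad\text{if } N \ge 108m \]
and
\[ L > \ell m \quad\text{if } \ell \ge 3 \text{ and } N \ge 7\ell^2 m^3. \]

Proof. This is a straightforward computation. Let
\[ x = N/m \ge 2 \]
and
\[ \ell_0 = \ell - \frac{1}{6}. \]
We see that
\[ L = \sqrt{8x - 8} - \sqrt{6x - 3} + \frac{1}{6} > \ell \]
if and only if
\[ \sqrt{8x - 8} > \sqrt{6x - 3} + \ell_0, \]
or, after squaring both sides and rearranging,
\[ 2x - \ell_0^2 - 5 > 2\ell_0\sqrt{6x - 3}. \]
Squaring and rearranging again, we obtain
\[ 4x\left(x - (7\ell_0^2 + 5)\right) + (\ell_0^2 + 5)^2 + 12\ell_0^2 > 0. \]
This inequality certainly holds if
\[ x \ge 7\ell_0^2 + 5 = 7\left(\ell - \frac{1}{6}\right)^2 + 5. \]
Therefore, $L > \ell$ if
\[ \frac{N}{m} \ge 7\left(\ell - \frac{1}{6}\right)^2 + 5. \]
Since
\[ 7\left(4 - \frac{1}{6}\right)^2 + 5 = 107.86\ldots, \]
it follows that $L > 4$ if $N \ge 108m$. Since
\[ 7\ell^2 \ge 7\left(\ell - \frac{1}{6}\right)^2 + 5 \]
for $\ell \ge 3$, it follows that $L > \ell$ if $\ell \ge 3$ and $N/m \ge 7\ell^2$. Therefore, if $\ell \ge 3$ and
$N \ge 7\ell^2 m^3$, then $L > \ell m$. This completes the proof.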
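
The two bounds of Lemma 1.10 are easy to evaluate numerically. The following Python sketch (function names are illustrative, and floating-point arithmetic is used, so it is only a sanity check) computes the endpoints and the length of the interval $I$.

```python
from math import sqrt

def interval(N: int, m: int) -> tuple[float, float]:
    """Endpoints of I = (1/2 + sqrt(6N/m - 3), 2/3 + sqrt(8N/m - 8))."""
    x = N / m
    return 0.5 + sqrt(6 * x - 3), 2 / 3 + sqrt(8 * x - 8)

def length(N: int, m: int) -> float:
    """Length L of the interval I."""
    lo, hi = interval(N, m)
    return hi - lo
```

At the threshold $N = 108m$ the length is only slightly greater than 4, which shows that the constant 108 in the lemma leaves little room.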


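Lemma 1.11 Let $m \ge 3$ and $N \ge 2m$. Let $a$, $b$, and $r$ be nonnegative integers
such that
\[ 0 \le r < m \]
and
\[ N = \frac{m}{2}(a - b) + b + r. \tag{1.10} \]
Consider the open interval
\[ I = \left( \frac{1}{2} + \sqrt{\frac{6N}{m} - 3},\ \frac{2}{3} + \sqrt{\frac{8N}{m} - 8} \right). \]
If
\[ b \in I, \]
then
\[ b^2 < 4a \tag{1.11} \]
and
\[ 3a < b^2 + 2b + 4. \tag{1.12} \]

Proof. From equation (1.10), we have
\[ a = \left(1 - \frac{2}{m}\right)b + 2\left(\frac{N - r}{m}\right). \]
By the quadratic formula,
\[ b^2 - 4a = b^2 - 4\left(1 - \frac{2}{m}\right)b - 8\left(\frac{N - r}{m}\right) < 0 \]
if
\[ 0 < b < 2\left(1 - \frac{2}{m}\right) + \sqrt{4\left(1 - \frac{2}{m}\right)^2 + 8\left(\frac{N - r}{m}\right)}. \]
If $b \in I$, then, since $m \ge 3$ and $r < m$,
\[ b < \frac{2}{3} + \sqrt{\frac{8N}{m} - 8} < 2\left(1 - \frac{2}{m}\right) + \sqrt{4\left(1 - \frac{2}{m}\right)^2 + 8\left(\frac{N - r}{m}\right)}. \]
This proves (1.11).
Again by the quadratic formula,
\[ b^2 + 2b + 4 - 3a = b^2 - \left(1 - \frac{6}{m}\right)b - \left(6\left(\frac{N - r}{m}\right) - 4\right) > 0 \]
if
\[ b > \left(\frac{1}{2} - \frac{3}{m}\right) + \sqrt{\left(\frac{1}{2} - \frac{3}{m}\right)^2 + 6\left(\frac{N - r}{m}\right) - 4}. \]
If $b \in I$, then
\begin{align*}
b &> \frac{1}{2} + \sqrt{\frac{6N}{m} - 3} \\
 &> \left(\frac{1}{2} - \frac{3}{m}\right) + \sqrt{\left(\frac{1}{2} - \frac{3}{m}\right)^2 + \frac{6N}{m} - 4} \\
 &\ge \left(\frac{1}{2} - \frac{3}{m}\right) + \sqrt{\left(\frac{1}{2} - \frac{3}{m}\right)^2 + 6\left(\frac{N - r}{m}\right) - 4}.
\end{align*}
This proves inequality (1.12).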
The following result is sometimes called Cauchy's lemma.

Lemma 1.12 Let $a$ and $b$ be odd positive integers such that
\[ b^2 < 4a \]
and
\[ 3a < b^2 + 2b + 4. \]
Then there exist nonnegative integers $s, t, u, v$ such that
\[ a = s^2 + t^2 + u^2 + v^2 \tag{1.13} \]
and
\[ b = s + t + u + v. \tag{1.14} \]

Proof. Since $a$ and $b$ are odd, it follows that $4a - b^2 \equiv 3 \pmod 8$. By
Theorem 1.5, there exist odd positive integers $x \ge y \ge z$ such that
\[ 4a - b^2 = x^2 + y^2 + z^2. \]
We can choose the sign of $\pm z$ so that $b + x + y \pm z \equiv 0 \pmod 4$. Define integers
$s, t, u, v$ as follows:
\begin{align*}
s &= \frac{b + x + y \pm z}{4} \\
t &= \frac{b + x}{2} - s = \frac{b + x - y \mp z}{4} \\
u &= \frac{b + y}{2} - s = \frac{b - x + y \mp z}{4} \\
v &= \frac{b \pm z}{2} - s = \frac{b - x - y \pm z}{4}.
\end{align*}
These numbers satisfy equations (1.13) and (1.14) and
\[ s \ge t \ge u \ge v. \]
We must show that $v \ge 0$. By Exercise 8, the maximum value of $x + y + z$ subject
to the constraint $x^2 + y^2 + z^2 = 4a - b^2$ is $\sqrt{12a - 3b^2}$. Also, the inequality
$3a < b^2 + 2b + 4$ implies that $\sqrt{12a - 3b^2} < b + 4$. Therefore,
\[ x + y + z \le \sqrt{12a - 3b^2} < b + 4, \]
and so
\[ v \ge \frac{b - x - y - z}{4} > -1. \]
Since $v$ is an integer, we must have $v \ge 0$. This completes the proof.
The following result is a strong form of Cauchy's polygonal number theorem.


Theorem 1.9 (Cauchy) If $m \ge 4$ and $N \ge 108m$, then $N$ can be written as the
sum of $m + 1$ polygonal numbers of order $m + 2$, at most four of which are different
from 0 or 1. If $N \ge 324$, then $N$ can be written as the sum of five pentagonal
numbers, at least one of which is 0 or 1.

Proof. By Lemma 1.10, the length of the interval
\[ I = \left( \frac{1}{2} + \sqrt{\frac{6N}{m} - 3},\ \frac{2}{3} + \sqrt{\frac{8N}{m} - 8} \right) \]
is greater than 4 since $N \ge 108m$, and so $I$ contains four consecutive integers
and, consequently, two consecutive odd numbers $b_1$ and $b_2$. If $m \ge 4$, the set of
numbers of the form $b + r$, where $b \in \{b_1, b_2\}$ and $r \in \{0, 1, \ldots, m - 3\}$, contains
a complete set of representatives of the congruence classes modulo $m$, and so we
can choose $b \in \{b_1, b_2\} \subseteq I$ and $r \in \{0, 1, \ldots, m - 3\}$ such that
\[ N \equiv b + r \pmod m. \]
Then
\[ a = 2\left(\frac{N - b - r}{m}\right) + b = \left(1 - \frac{2}{m}\right)b + 2\left(\frac{N - r}{m}\right) \tag{1.15} \]
is an odd positive integer, and
\[ N = \frac{m}{2}(a - b) + b + r. \]
By Lemma 1.11, since $b \in I$, we have
\[ b^2 < 4a \]
and
\[ 3a < b^2 + 2b + 4. \]
By Lemma 1.12, there exist nonnegative integers $s, t, u, v$ such that
\[ a = s^2 + t^2 + u^2 + v^2 \]
and
\[ b = s + t + u + v. \]
Therefore,
\begin{align*}
N &= \frac{m}{2}(a - b) + b + r \\
 &= \frac{m}{2}(s^2 - s + t^2 - t + u^2 - u + v^2 - v) + (s + t + u + v) + r \\
 &= p_m(s) + p_m(t) + p_m(u) + p_m(v) + r.
\end{align*}
Since $0 \le r \le m - 3$ and since 0 and 1 are polygonal numbers of order $m + 2$ for
every $m$, we obtain Cauchy's theorem for $m \ge 4$, that is, for polygonal numbers of
order at least six. To obtain the result for pentagonal numbers, that is, for $m = 3$,
we consider numbers of the form $b_1 + r$ and $b_2 + r$, where $b_1, b_2$ are consecutive
odd integers in the interval $I$, and $r = 0$ or 1.
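
The proof of Theorem 1.9 is effectively an algorithm: choose an odd $b$ in $I$ and a residue $r$, compute $a$ from (1.15), and apply Cauchy's lemma. The Python sketch below follows these steps for $m \ge 4$. It is an illustration under stated assumptions: the function names are hypothetical, Lemma 1.12 is replaced by a brute-force search for $s, t, u, v$, and candidates for $b$ are generated near $I$ with floating point while the inequalities (1.11) and (1.12) are verified exactly in integers.

```python
from math import floor, isqrt, sqrt

def p(m: int, k: int) -> int:
    """k-th polygonal number of order m + 2."""
    return m * k * (k - 1) // 2 + k

def cauchy_lemma(a: int, b: int):
    """Search for s >= t >= u >= v >= 0 with s+t+u+v = b and s^2+t^2+u^2+v^2 = a."""
    for s in range(isqrt(a), -1, -1):
        for t in range(min(s, b - s), -1, -1):
            for u in range(min(t, b - s - t), -1, -1):
                v = b - s - t - u
                if 0 <= v <= u and s * s + t * t + u * u + v * v == a:
                    return s, t, u, v
    return None

def cauchy_representation(N: int, m: int):
    """For m >= 4 and N >= 108m, return (s, t, u, v, r) with
    N = p_m(s) + p_m(t) + p_m(u) + p_m(v) + r and 0 <= r <= m - 3."""
    lo = 0.5 + sqrt(6 * N / m - 3)
    hi = 2 / 3 + sqrt(8 * N / m - 8)
    for b in range(max(1, floor(lo)), floor(hi) + 2):   # candidates near I
        if b % 2 == 0:
            continue
        for r in range(m - 2):                          # r = 0, ..., m - 3
            if (N - b - r) % m != 0:
                continue
            a = 2 * (N - b - r) // m + b                # formula (1.15)
            if b * b < 4 * a and 3 * a < b * b + 2 * b + 4:
                found = cauchy_lemma(a, b)
                if found is not None:
                    return (*found, r)
    return None
```

For example, with $m = 4$ and $N = 432$ one admissible choice is $b = 27$, $r = 1$, $a = 229$, and $229 = 10^2 + 10^2 + 5^2 + 2^2$ with $10 + 10 + 5 + 2 = 27$, so that $432 = p_4(10) + p_4(10) + p_4(5) + p_4(2) + 1$.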
Theorem 1.10 (Legendre) Let $m \ge 3$ and $N \ge 28m^3$. If $m$ is odd, then $N$ is the
sum of four polygonal numbers of order $m + 2$. If $m$ is even, then $N$ is the sum of
five polygonal numbers of order $m + 2$, at least one of which is 0 or 1.


Proof. Since $2m \ge 6$ and $N/m \ge 28m^2 = 7(2m)^2$, the proof of Lemma 1.10 (with
$\ell$ replaced by $2m$) shows that the length of the interval $I$ is greater than $2m$, so $I$
contains $m$ consecutive odd numbers. If $m$ is odd, these form a complete set of
representatives of the congruence classes modulo $m$, so $N \equiv b \pmod m$ for
some odd integer $b \in I$. Let $r = 0$ and define $a$ by formula (1.15). Then
\[ N = \frac{m}{2}(a - b) + b, \]
and it follows from Lemma 1.11 and Lemma 1.12 that $N$ is the sum of four
polygonal numbers of order $m + 2$.

If $m$ is even and $N$ is odd, then $N \equiv b \pmod m$ for some odd integer $b \in I$
and $N$ is the sum of four polygonal numbers of order $m + 2$. If $m$ is even and $N$ is
even, then $N - 1 \equiv b \pmod m$ for some odd integer $b \in I$ and $N$ is the sum of
five polygonal numbers of order $m + 2$, one of which is $p_m(1) = 1$. This completes
the proof.

A set of integers is called an asymptotic basis of order $h$ if every sufficiently
large integer can be written as the sum of $h$ not necessarily distinct elements of
the set. Legendre's theorem shows that if $m \ge 3$ and $m$ is odd, then the polygonal
numbers of order $m + 2$ form an asymptotic basis of order 4, and if $m \ge 4$ and $m$
is even, then the polygonal numbers of order $m + 2$ form an asymptotic basis of
order 5.


1.8 Notes 


Polygonal numbers go back at least as far as Pythagoras. They are discussed at 
length by Diophantus in his book Arithmetica and in a separate essay On polygonal 
numbers. An excellent reference is Diophantus of Alexandria: A Study in the 
History of Greek Algebra, by T. L. Heath [53]. Dickson’s History of the Theory of 
Numbers [22, Vol. I, Ch.1] provides a detailed history of polygonal numbers and 
sums of squares. 

There are many different proofs of Lagrange’s theorem that every nonnegative 
integer is the sum of four squares. For a proof using the geometry of numbers, see 
Nathanson [93]. There is a vast literature concerned with the number of representa- 
tions of an integer as the sum of s squares. Extensive treatments of these matters can 
be found in the monographs of Grosswald [43], Knopp [74], and Rademacher [98]. 
Liouville discovered an important and powerful elementary method that produces 
many of the same results (see Dickson [22, Vol. II, Ch. 11] or Uspensky and 
Heaslet [122]). 

Legendre and Gauss determined the numbers that can be represented as the sum 
of three squares. See Dickson [22, Vol. II] for historical references. In this chapter, 
I followed the beautiful exposition of Landau [78]. There is also a nice proof by
Weil [140] that every positive integer congruent to $3 \pmod 8$ is the sum of three
odd squares.

Cauchy [9] published the first proof of the polygonal number theorem. Legen- 
dre’s theorem that the polygonal numbers of order m form an asymptotic basis of 
order 4 or 5 appears in [80, Vol. 2, pp. 331-356]. In this chapter I gave a simple 
proof of Nathanson [91, 92], which is based on Pepin [95]. 

Theorem 1.7 is due to Choi, Erdős, and Nathanson [13]. Using a probabilistic
result of Erdős and Nathanson [36], Zöllner [152] has proved the existence of a
basis of order 4 for $N$ consisting of $\ll N^{1/4+\varepsilon}$ squares. It is not known if the $\varepsilon$ can
be removed from this inequality. Nathanson [89], Spencer [118], Wirsing [145],
and Zöllner [151] proved the existence of "thin" subsets of the squares that are
bases of order 4 for the set of all nonnegative integers.


1.9 Exercises 


1. Let $m \ge 2$. Show that the polygonal numbers of order $m + 2$ can be written
in terms of the triangular numbers as follows:
\[ p_m(k) = m p_1(k - 1) + k \]
for all $k \ge 1$.

2. (Nicomachus, 100 A.D.) Prove that the sum of two consecutive triangular
numbers is a square. Prove that the sum of the $n$th square and the $(n - 1)$-st
triangular number is the $n$th pentagonal number.

3. Let $v(2)$ be the smallest number such that every integer $N$ can be written in
the form
\[ N = \pm x_1^2 \pm \cdots \pm x_{v(2)}^2. \]
Prove that $v(2) = 3$. This is called the easier Waring's problem for squares.
Hint: Use the identities
\[ 2x + 1 = (x + 1)^2 - x^2 \]
and
\[ 2x = (x + 1)^2 - x^2 - 1^2. \]

4. Prove that if $m$ is the sum of two squares and $n$ is the sum of two squares,
then $mn$ is the sum of two squares. Hint: Use the polynomial identity
\[ (x_1^2 + x_2^2)(y_1^2 + y_2^2) = (x_1y_1 + x_2y_2)^2 + (x_1y_2 - x_2y_1)^2. \]

5. (Nathanson [88]) Prove that there does not exist a polynomial identity of the
form
\[ (x_1^2 + x_2^2 + x_3^2)(y_1^2 + y_2^2 + y_3^2) = z_1^2 + z_2^2 + z_3^2, \]
where $z_1, z_2, z_3$ are polynomials in $x_1, x_2, x_3, y_1, y_2, y_3$ with integral
coefficients.


6. Prove that Theorem 1.4 implies Lagrange's theorem (Theorem 1.1).

7. Prove that the set of triangular numbers is not a basis of order 2.

8. Let $S^2 = \{(x, y, z) \in \mathbf{R}^3 : x^2 + y^2 + z^2 = 1\}$. Prove that
\[ \{x + y + z : (x, y, z) \in S^2\} = [-\sqrt{3}, \sqrt{3}]. \]

9. Let
\[ F_A(x_1, \ldots, x_n) = \sum_{i,j=1}^n a_{i,j}x_ix_j \]
and
\[ F_B(x_1, \ldots, x_n) = \sum_{i,j=1}^n b_{i,j}x_ix_j \]
be quadratic forms in $n$ variables such that
\[ F_A(x_1, \ldots, x_n) = F_B(x_1, \ldots, x_n) \]
for all $x_1, \ldots, x_n \in \mathbf{Z}$. Prove that $a_{i,j} = b_{i,j}$ for all $i, j = 1, \ldots, n$.

10. Let $A$ be an $n \times n$ symmetric matrix, and let $F_A$ be the corresponding
quadratic form. Let
\[ U = (u_{i,j}) \]
and
\[ B = U^T A U = (b_{i,j}). \]
Prove that
\[ b_{j,j} = F_A(u_{1,j}, u_{2,j}, \ldots, u_{n,j}) \]
for $j = 1, \ldots, n$.

11. For $N \ge 1$, let $k = [\sqrt{N}] + 1$ and
\[ A = \{0, 1, \ldots, k - 1\} \cup \{k, 2k, \ldots, (k - 1)k\}. \]
Show that $A$ is a basis of order 2 for $N$ such that
\[ |A| \le 2\sqrt{N} + 1. \]

12. Let $h \ge 2$, $k \ge 2$, and
\[ A = \{0\} \cup \bigcup_{i=0}^{h-1} \{a_ik^i : a_i = 1, \ldots, k - 1\}. \]
Prove that $A$ is a basis of order $h$ for $k^h - 1$ and
\[ |A| \le h(k - 1) + 1. \]

13. (Raikov [99], Stöhr [119]) Let $h \ge 2$ and $N \ge 2^h$. Let $A$ be the set
constructed in the preceding exercise with
\[ k = [N^{1/h}] + 1. \]
Prove that $A$ is a basis of order $h$ for $N$ such that
\[ |A| \le hN^{1/h} + 1. \]


2 


Waring’s problem for cubes 


Omnis integer numerus vel est cubus; vel e duobus, tribus, 4,5,6,7,8,
vel novem cubus compositus: est etiam quadratoquadratus; vel e duobus,
tribus &c. usque ad novemdecim compositus & sic deinceps.¹


E. Waring [138] 


2.1 Sums of cubes 


In his book Meditationes Algebraicae, published in 1770, Edward Waring stated
without proof that every nonnegative integer is the sum of four squares, nine cubes,
19 fourth powers, and so on. Waring's problem is to prove that, for every $k \ge 2$,
the set of nonnegative $k$th powers is a basis of finite order.

Waring’s problem for cubes is to prove that every nonnegative integer is the 
sum of a bounded number of nonnegative cubes. The least such number is denoted 
g(3). Wieferich and Kempner proved that g(3) = 9, and so the cubes are a basis 
of order nine. This is clearly best possible, since there are integers, such as 23 and 
239, that cannot be written as sums of eight cubes. 

Immediately after Wieferich published his theorem, Landau observed that, in
fact, only finitely many positive integers actually require nine cubes, that is, every
sufficiently large integer is the sum of eight cubes. Indeed, 23 and 239 are the
only positive integers that cannot be written as sums of eight nonnegative cubes.
A set of integers is called an asymptotic basis of order $h$ if every sufficiently large
integer can be written as the sum of exactly $h$ elements of the set. Thus, Landau's
theorem states that the cubes are an asymptotic basis of order eight. Later, Linnik
proved that only finitely many integers require eight cubes, so every sufficiently
large integer is the sum of seven cubes, that is, the cubes are an asymptotic basis of
order seven. On the other hand, an examination of congruences modulo 9 shows
that there are infinitely many positive integers that cannot be written as sums of
three cubes.

¹Every positive integer is either a cube or the sum of 2, 3, 4, 5, 6, 7, 8, or 9 cubes; similarly,
every integer is either a fourth power, or the sum of 2, 3, ..., or 19 fourth powers; and so
on.

Let $G(3)$ denote the smallest integer $h$ such that the cubes are an asymptotic
basis of order $h$, that is, such that every sufficiently large positive integer can be
written as the sum of $h$ nonnegative cubes. Then
\[ 4 \le G(3) \le 7. \]


To determine the exact value of G(3) is a major unsolved problem of additive 
number theory. It is known that almost all positive integers are sums of four cubes, 
and it is possible that G(3) = 4. 

The principal results of this chapter are the theorems of Wieferich-Kempner 
and of Linnik. Because of the mystery surrounding sums of few cubes, we also 
include a section about sums of two cubes. We shall prove that there are integers 
with arbitrarily many representations as the sum of two nonnegative cubes, but 
that almost all numbers that can be written in at least one way as the sum of two 
nonnegative cubes have essentially only one such representation. 


2.2 The Wieferich-Kempner theorem 


The proof that g(3) = 9 requires four lemmas. 


Lemma 2.1 Let $A$ and $m$ be nonnegative integers such that $m \le A^2$ and $m$ can
be written as the sum of three squares. Then
\[ 6A(A^2 + m) \]
is a sum of six nonnegative cubes.

Proof. Let $m_1, m_2, m_3$ be nonnegative integers such that
\[ m = m_1^2 + m_2^2 + m_3^2. \]
Then
\[ 0 \le m_i \le \sqrt{m} \le A \]
for $i = 1, 2, 3$, and
\[ 6A(A^2 + m) = 6A(A^2 + m_1^2 + m_2^2 + m_3^2) = \sum_{i=1}^3 \left( (A + m_i)^3 + (A - m_i)^3 \right). \]
This completes the proof.


Lemma 2.2 Let $t \ge 1$. For every odd integer $w$, there is an odd integer $b$ such
that
\[ w \equiv b^3 \pmod{2^t}. \]

Proof. If $b$ is odd and $w \equiv b^3 \pmod{2^t}$, then $w$ is odd. Let $b_1$ and $b_2$ be odd
integers such that
\[ b_1^3 \equiv b_2^3 \pmod{2^t}. \]
Then $2^t$ divides
\[ b_2^3 - b_1^3 = (b_2 - b_1)(b_2^2 + b_2b_1 + b_1^2). \]
Since $b_2^2 + b_2b_1 + b_1^2$ is odd, it follows that $2^t$ divides $b_2 - b_1$, that is,
\[ b_1 \equiv b_2 \pmod{2^t}. \]
This means that if $b_1$ and $b_2$ are odd integers such that
\[ 0 < b_1 < b_2 < 2^t, \]
then
\[ b_1^3 \not\equiv b_2^3 \pmod{2^t}, \]
and so every odd integer is congruent to a cube modulo $2^t$. This completes the
proof.
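
Lemma 2.2 says that cubing permutes the odd residue classes modulo $2^t$. A minimal Python sketch (the function name is illustrative) makes this concrete for small $t$.

```python
def odd_cubes_mod(t: int) -> set[int]:
    """Residues b^3 mod 2^t as b runs over the odd integers in (0, 2^t)."""
    modulus = 2 ** t
    return {pow(b, 3, modulus) for b in range(1, modulus, 2)}

# Cubing is injective on odd residues, so every odd residue should appear.
all_odd_classes_hit = all(
    odd_cubes_mod(t) == set(range(1, 2 ** t, 2)) for t in range(1, 13)
)
```

Since there are $2^{t-1}$ odd residues and the cubes of distinct odd residues are distinct, the set of cubes must be the full set of odd residues, which is what the comparison checks.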


Lemma 2.3 If
\[ r \ge 10648 = 22^3, \]
then there exists an integer $d \in [0, 22]$ and an integer $m$ that is a sum of three
squares such that
\[ r = d^3 + 6m. \]

Proof. If the nonnegative integer $m$ is not the sum of three squares, then there
exist nonnegative integers $s$ and $t$ such that
\[ m = 4^s(8t + 7), \]
and so
\[ 6m = 6 \cdot 4^s(8t + 7) \equiv \begin{cases} 0 \pmod{96} & \text{if } s \ge 2 \\ 72 \pmod{96} & \text{if } s = 1 \\ 42 \pmod{96} & \text{if } s = 0 \text{ and } t \text{ is even} \\ 90 \pmod{96} & \text{if } s = 0 \text{ and } t \text{ is odd.} \end{cases} \]
It follows that if $m$ is a positive integer and
\[ 6m \equiv h \pmod{96} \]
for some
\[ h \in H = \{6, 12, 18, 24, 30, 36, 48, 54, 60, 66, 78, 84\}, \]
then $m$ is the sum of three squares. The following table lists, for various $h \in H$
and
\[ d \in D = \{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 17, 18, 22\}, \]
the least nonnegative residue in the congruence class
\[ d^3 + h \pmod{96}. \]
The elements of $H$ are listed in the top row, and the elements of $D$ are listed in the
column on the left.

[Table: rows indexed by $d \in D$, columns indexed by $h \in H$; the entries did not survive in this copy.]

Every congruence class modulo 96 appears in this table. Since $0 \le d \le 22$ for
all $d \in D$, it follows that if $r \ge 22^3$, then there exists an integer $d \in D$ such that
$r - d^3$ is nonnegative and $r - d^3 \equiv h \pmod{96}$ for some $h \in H$. Therefore,
$r - d^3 = 6m$, where $m$ is the sum of three squares. This completes the proof.
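
The table in this proof is determined by the sets $H$ and $D$, so it can be regenerated by machine. The following Python sketch (the variable and function names are illustrative) computes all residues $d^3 + h \pmod{96}$, checks that every congruence class modulo 96 occurs, and uses the sets to decompose a given $r \ge 22^3$ as $d^3 + 6m$.

```python
H = [6, 12, 18, 24, 30, 36, 48, 54, 60, 66, 78, 84]
D = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 17, 18, 22]

# table[d][j] is the least nonnegative residue of d^3 + H[j] modulo 96.
table = {d: [(d ** 3 + h) % 96 for h in H] for d in D}
covered = {entry for row in table.values() for entry in row}

def not_three_squares_form(m: int) -> bool:
    """True when the positive integer m has the form 4^s (8t + 7)."""
    while m % 4 == 0:
        m //= 4
    return m % 8 == 7

def decompose(r: int) -> tuple[int, int]:
    """For r >= 22^3 return (d, m) with r = d^3 + 6m, d in D, and 6m in a class of H."""
    for d in D:
        if r >= d ** 3 and (r - d ** 3) % 96 in H:
            return d, (r - d ** 3) // 6
    raise ValueError("no admissible d found")
```

Here `covered == set(range(96))` expresses the statement that every congruence class modulo 96 appears in the table.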


Lemma 2.4 If $1 \le N \le 40{,}000$, then

(i) $N$ is a sum of nine nonnegative cubes;

(ii) if $N \ne 23$ or 239, then $N$ is a sum of eight nonnegative cubes;

(iii) if $N \ne 23$ or 239 and if $N$ is not one of the following fifteen numbers:

 15   22   50  114  167
175  186  212  231  238
303  364  420  428  454

then $N$ is a sum of seven nonnegative cubes;

(iv) if $N > 8042$, then $N$ is a sum of six nonnegative cubes.

Proof. Let $s(N)$ denote the least integer $h$ such that $N$ is the sum of $h$ nonnegative
cubes. Von Sterneck computed $s(N)$ for all $N$ up to 40,000. The four statements in
the lemma are obtained by examining von Sterneck's list of values of $s(N)$. Using a
computer, one can quickly verify (and extend) von Sterneck's list (see Exercise 8).
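
One way to carry out such a verification is a simple dynamic program. The following Python sketch (the function name is illustrative; it is not von Sterneck's method) computes $s(N)$ for all $N$ up to a given limit, from which the four statements of Lemma 2.4 can be read off.

```python
def min_cubes(limit: int) -> list[int]:
    """s[n] = least number of positive cubes with sum n, for 0 <= n <= limit."""
    cubes = []
    c = 1
    while c ** 3 <= limit:
        cubes.append(c ** 3)
        c += 1
    s = [0] * (limit + 1)
    for n in range(1, limit + 1):
        # 1 = 1^3 is always available, so the minimum is over a nonempty set.
        s[n] = 1 + min(s[n - q] for q in cubes if q <= n)
    return s

s = min_cubes(40_000)
need_nine = [n for n in range(1, 40_001) if s[n] == 9]
need_eight = [n for n in range(1, 40_001) if s[n] == 8]
largest_needing_seven = max(n for n in range(1, 40_001) if s[n] == 7)
```

The lists `need_nine` and `need_eight` correspond to the exceptional integers in parts (ii) and (iii), and `largest_needing_seven` corresponds to the threshold in part (iv).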


Theorem 2.1 (Wieferich-Kempner) Every nonnegative integer is the sum of nine
nonnegative cubes.

Proof. We shall first prove the theorem for integers
\[ N > 8^{10}. \]
There exists an integer $k \ge 3$ such that
\[ 8 \cdot 8^{3k} < N \le 8 \cdot 8^{3(k+1)}. \]
Let
\[ n = [N^{1/3}]. \]
Then
\[ 2 \cdot 8^k \le n \le 2 \cdot 8^{k+1}. \]
Let
\[ N_i = N - i^3. \]
For $i = 1, \ldots, n$ we have
\[ d_i = N_{i-1} - N_i = i^3 - (i - 1)^3 = 3i^2 - 3i + 1 < 3i^2 \le 3N^{2/3} \le \tfrac{3}{2} \cdot 8^{2k+3}. \]
Choose $i$ so that
\[ N_{i+1} < 8 \cdot 8^{3k} \le N_i. \]
Then $i \ge 1$. Since $k \ge 3$, we have
\begin{align*}
N_n &= N - n^3 \\
 &\le (n + 1)^3 - n^3 - 1 \\
 &= 3n^2 + 3n \\
 &< 6n^2 \\
 &\le 3 \cdot 8^{2k+3} \\
 &< 8 \cdot 8^{3k}.
\end{align*}
Therefore, $i \le n - 1$. It follows that
\begin{align*}
N_i < N_{i-1} &= (N_{i-1} - N_i) + (N_i - N_{i+1}) + N_{i+1} \\
 &= d_i + d_{i+1} + N_{i+1} \\
 &< 3 \cdot 8^{2k+3} + 8 \cdot 8^{3k} \\
 &\le 11 \cdot 8^{3k}.
\end{align*}
Since $N_{i-1} - N_i = d_i$ is odd, exactly one of the integers $N_i$ and $N_{i-1}$ is odd. Choose
$a \in \{i - 1, i\}$ such that $N_a = N - a^3$ is odd. By Lemma 2.2, there is an odd integer
$b \in [1, 8^k - 1]$ such that
\[ N - a^3 \equiv b^3 \pmod{8^k}. \]
Then
\[ 7 \cdot 8^{3k} = 8 \cdot 8^{3k} - 8^{3k} < N - a^3 - b^3 < 11 \cdot 8^{3k} \]
and
\[ N - a^3 - b^3 = 8^k q, \]
where
\[ 7 \cdot 8^{2k} < q < 11 \cdot 8^{2k}. \]
Let
\[ r = q - 6 \cdot 8^{2k}. \]
Then
\[ 22^3 < 8^6 \le 8^{2k} < r < 5 \cdot 8^{2k}. \]
It follows from Lemma 2.3 that $r$ can be written in the form
\[ r = d^3 + 6m, \]
where $0 \le d \le 22$ and $m$ is a sum of three squares. Let
\[ A = 8^k. \]
Then
\[ m \le \frac{r}{6} < \frac{5A^2}{6} < A^2. \]
Let
\[ c = 2^k d. \]
Then
\begin{align*}
N &= a^3 + b^3 + 8^k q \\
 &= a^3 + b^3 + 8^k(6 \cdot 8^{2k} + r) \\
 &= a^3 + b^3 + 8^k(6 \cdot 8^{2k} + d^3 + 6m) \\
 &= a^3 + b^3 + (2^k d)^3 + 8^k(6 \cdot 8^{2k} + 6m) \\
 &= a^3 + b^3 + c^3 + 6A(A^2 + m).
\end{align*}
By Lemma 2.1, $6A(A^2 + m)$ is a sum of six nonnegative cubes, so $N$ is the sum of
nine nonnegative cubes.


Now let
\[ 40{,}000 < N \le 8^{10}. \]
Then
\[ a = [(N - 10{,}000)^{1/3}] \ge [30{,}000^{1/3}] = 31, \]
so
\[ d = (a + 1)^3 - a^3 = 3a^2 + 3a + 1 < 4a^2 \le 4N^{2/3}. \]
Therefore,
\[ N - (a + 1)^3 < 10{,}000 \le N - a^3 = N - (a + 1)^3 + d < 10{,}000 + 4N^{2/3}. \]
If $N - a^3 \le 40{,}000$, then $N - a^3$ is a sum of six nonnegative cubes by Lemma 2.4.
If $N - a^3 > 40{,}000$, then we choose the integer
\[ b = [(N - a^3 - 10{,}000)^{1/3}] \ge 31, \]
and obtain
\[ N - a^3 - (b + 1)^3 < 10{,}000 \le N - a^3 - b^3 < 10{,}000 + 4(N - a^3)^{2/3}. \]
If $N - a^3 - b^3 \le 40{,}000$, then $N - a^3 - b^3$ is a sum of six nonnegative cubes by
Lemma 2.4. If $N - a^3 - b^3 > 40{,}000$, then we choose the integer
\[ c = [(N - a^3 - b^3 - 10{,}000)^{1/3}] \ge 31 \]
and obtain
\begin{align*}
N - a^3 - b^3 - (c + 1)^3 &< 10{,}000 \\
 &\le N - a^3 - b^3 - c^3 \\
 &< 10{,}000 + 4(N - a^3 - b^3)^{2/3} \\
 &< 10{,}000 + 4\left(10{,}000 + 4\left(10{,}000 + 4N^{2/3}\right)^{2/3}\right)^{2/3} \\
 &\le 10{,}000 + 4\left(10{,}000 + 4\left(10{,}000 + 4\left(8^{10}\right)^{2/3}\right)^{2/3}\right)^{2/3} \\
 &< 20{,}000.
\end{align*}
Thus, if $40{,}000 < N \le 8^{10}$, then there exist three nonnegative integers $a$, $b$, and
$c$ such that
\[ 10{,}000 \le N - a^3 - b^3 - c^3 \le 40{,}000. \]
By Lemma 2.4, $N - a^3 - b^3 - c^3$ is the sum of six nonnegative cubes. This
completes the proof.
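
The second half of the proof is a greedy descent, which the following Python sketch imitates (the function names are illustrative). Starting from $40{,}000 < N \le 8^{10}$, it repeatedly subtracts the cube of $[(\text{remainder} - 10{,}000)^{1/3}]$ until the remainder is at most 40,000; by the estimates above, at most three cubes are removed.

```python
def icbrt(x: int) -> int:
    """Largest integer a >= 0 with a^3 <= x (float guess, corrected exactly)."""
    a = int(round(x ** (1.0 / 3)))
    while a ** 3 > x:
        a -= 1
    while (a + 1) ** 3 <= x:
        a += 1
    return a

def peel_cubes(N: int) -> tuple[list[int], int]:
    """For 40,000 < N <= 8^10, return (roots, rest) with
    N = sum(a^3 for a in roots) + rest and 10,000 <= rest <= 40,000."""
    roots, rest = [], N
    while rest > 40_000:
        a = icbrt(rest - 10_000)
        roots.append(a)
        rest -= a ** 3
    return roots, rest
```

For instance, `peel_cubes(8 ** 10)` returns the roots `[1023, 146]` and the remainder `30521`, to which Lemma 2.4 (iv) applies.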
2.3 Linnik's theorem


Let $G(3)$ denote the smallest integer $s$ such that every sufficiently large integer is
the sum of $s$ nonnegative cubes.


Theorem 2.2 If $N \equiv \pm 4 \pmod 9$, then $N$ is not the sum of three integral cubes.
In particular,
\[ G(3) \ge 4. \]

Proof. Since the cube of every integer, positive or negative, is congruent to 0, 1, or $-1$
modulo 9, it follows that every sum of three cubes belongs to one of the seven
congruence classes $0, \pm 1, \pm 2, \pm 3 \pmod 9$. Therefore, if $N \equiv \pm 4 \pmod 9$,
then $N$ cannot be the sum of three cubes, so $G(3) \ge 4$.
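
The congruence argument can be confirmed by enumeration. This short Python sketch (variable names are illustrative) lists the cubic residues modulo 9 and all residues of sums of three cubes.

```python
from itertools import product

# Residues of x^3 modulo 9; the residue 8 represents the class of -1.
cube_residues = {pow(x, 3, 9) for x in range(9)}

# Residues modulo 9 of all sums of three integral cubes.
three_cube_sums = {(a + b + c) % 9 for a, b, c in product(cube_residues, repeat=3)}

# The classes of +4 and -4 modulo 9 are represented by 4 and 5.
missing = set(range(9)) - three_cube_sums
```

The set `missing` contains exactly the residues 4 and 5, that is, the classes $\pm 4 \pmod 9$ excluded by Theorem 2.2.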


Lemma 2.5 Let $n$ be a positive integer. If there exist distinct primes $p, q, r$ such
that
\begin{align*}
p \equiv q \equiv r &\equiv -1 \pmod 6, \tag{2.1} \\
r < q &< 1.02r, \tag{2.2} \\
\tfrac{3}{4}p^3q^{18} < n &< p^3q^{18}, \tag{2.3} \\
4n &\equiv p^3r^{18} \pmod{q^6}, \tag{2.4} \\
2n &\equiv p^3q^{18} \pmod{r^6}, \tag{2.5} \\
n &\equiv 3p \pmod{6p}, \tag{2.6}
\end{align*}
then $n$ is the sum of six positive integral cubes.

Proof. It follows from (2.2) and (2.3) that
\begin{align*}
p^3(4q^{18} + 2r^{18}) &< 6p^3q^{18} \\
 &< 8n \\
 &< 8p^3q^{18} \\
 &< p^3(4q^{18} + 4(1.02r)^{18}) \\
 &< p^3(4q^{18} + 8r^{18}).
\end{align*}
Thus,
\[ p^3(4q^{18} + 2r^{18}) < 8n < p^3(4q^{18} + 8r^{18}). \tag{2.7} \]
Congruences (2.6), (2.4), and (2.5) imply that
\begin{align*}
8n &\equiv 2p^3r^{18} \equiv p^3(4q^{18} + 2r^{18}) + 18pq^6r^6 \pmod{q^6}, \\
8n &\equiv 4p^3q^{18} \equiv p^3(4q^{18} + 2r^{18}) + 18pq^6r^6 \pmod{r^6}, \\
8n &\equiv 0 \equiv p^3(4q^{18} + 2r^{18}) + 18pq^6r^6 \pmod p,
\end{align*}
so
\[ 8n \equiv p^3(4q^{18} + 2r^{18}) + 18pq^6r^6 \pmod{pq^6r^6}. \tag{2.8} \]
It follows from (2.1) and (2.6) that
\[ n \equiv 3p \equiv -3 \equiv 3 \pmod 6, \]
so
\[ 8n \equiv 24 \pmod{48}. \tag{2.9} \]


By (2.1), the primes p, qg, r are odd; hence 
p?=q’ =r*=1 (mod 8) 


and | 
Dp? (2q"* +r'®) +9pq°r? = (2+1)p+tp=4p =4 (mod 8). 


Therefore, 
p*(4q'8 + 2r!8) + 18pq°r® = 8 (mod 16). 


Similarly, since p = gq =r=-—1 (mod 3), we have 
p’(4q'8 + 2r'®) + 18pq°r° =0 (mod 3) 


SO 
p?(4q'8 + 2r}8) + 18pq°r® = 24 (mod 48). (2.10) 


Since (pqr, 48) = 1, we can combine (2.8), (2.9), and (2.10) to obtain 
8n = p°(4q'® + 2r!8) + 18pq°r® (mod 48pq°r°). 
Therefore, there exists an integer u such that 
8n = p?(4q!® + 2r'8) + 18pq®r® + 48 pq®r®u 
= p*(4q'8 + 2r!8) + 6pq®r®(8u + 3). 
It follows from (2.7) that 
0 < 6pq°r®(8u + 3) < 6p°r'®, 


SO 
0 < 8u+3 < p’q°*r”™. 


By Theorem 1.5, 
8u+3 =x + y? +27, 


where x, y, z are odd positive integers less than pq~°r°, that is, 
max{q°x, q°y, g°z} < pr®. (2.11) 
Therefore, 
8n = p?(4q'® + 2r'8) + 6pq®r®(x? + y? +z’) 
= (pq? +r°xy + (pq? — xy + (pq? +r yy 
+(pq° — r°y)’ + (pr? +q°z)? + (pr® — q°zy’. 
Since each of the six integers $p, q, r, x, y, z$ is odd, it follows that each of the six cubes in the preceding expression is the cube of an even integer. Moreover, each of these cubes is positive, since, by (2.2) and (2.11),
\[ 0 < r^3x < q^3x < pr^6 < pq^6, \]
\[ 0 < r^3y < q^3y < pr^6 < pq^6, \]
and
\[ 0 < q^3z < pr^6. \]
Therefore,
\[
\begin{aligned}
n &= \left(\frac{pq^6 + r^3x}{2}\right)^3 + \left(\frac{pq^6 - r^3x}{2}\right)^3 + \left(\frac{pq^6 + r^3y}{2}\right)^3 \\
&\quad + \left(\frac{pq^6 - r^3y}{2}\right)^3 + \left(\frac{pr^6 + q^3z}{2}\right)^3 + \left(\frac{pr^6 - q^3z}{2}\right)^3
\end{aligned}
\]
is a sum of six positive cubes.
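The algebra in the proof of Lemma 2.5 rests on the identity $(a+b)^3 + (a-b)^3 = 2a^3 + 6ab^2$. The following Python sketch is an illustration only and not part of the proof: the function names `eight_n` and `six_cube_bases` are hypothetical, and the sample values are arbitrary odd integers rather than primes chosen to satisfy (2.1)-(2.6). It checks that the six bases are even and that their cubes sum to $8n$.

```python
def eight_n(p, q, r, x, y, z):
    """Right-hand side 8n = p^3(4q^18 + 2r^18) + 6pq^6r^6(x^2 + y^2 + z^2)."""
    return p**3 * (4 * q**18 + 2 * r**18) + 6 * p * q**6 * r**6 * (x * x + y * y + z * z)


def six_cube_bases(p, q, r, x, y, z):
    """The six integers whose cubes sum to 8n in the proof of Lemma 2.5."""
    return [
        p * q**6 + r**3 * x, p * q**6 - r**3 * x,
        p * q**6 + r**3 * y, p * q**6 - r**3 * y,
        p * r**6 + q**3 * z, p * r**6 - q**3 * z,
    ]


if __name__ == "__main__":
    sample = (5, 7, 3, 1, 3, 5)  # arbitrary odd integers, chosen only to test the algebra
    bases = six_cube_bases(*sample)
    print(all(b % 2 == 0 for b in bases))                 # every base is even
    print(sum(b**3 for b in bases) == eight_n(*sample))   # the cubes sum to 8n
```

Halving each base gives six integers whose cubes sum to $n$; positivity of those integers is exactly what (2.2) and (2.11) are used for, and is not checked by this sketch.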


Theorem 2.3 (Linnik) Every sufficiently large integer is the sum of seven positive cubes, that is,
\[ G(3) \leq 7. \]


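Proof. Let $k$ and $\ell$ be integers such that $k \geq 1$ and $(k, \ell) = 1$. We define the Chebyshev function for the arithmetic progression $\ell$ modulo $k$ by
\[ \vartheta(x; k, \ell) = \sum_{\substack{p \leq x \\ p \equiv \ell \pmod{k}}} \log p. \]
The Siegel–Walfisz theorem states that for any $A > 0$ and for all $x \geq 2$,
\[ \vartheta(x; k, \ell) = \frac{x}{\varphi(k)} + O\left(\frac{x}{(\log x)^A}\right), \tag{2.12} \]
where $\varphi(k)$ is the Euler $\varphi$-function, and the implied constant depends only on $A$. It follows that, for any $\delta > 0$,
\[ \vartheta((1+\delta)x; k, \ell) - \vartheta(x; k, \ell) = \frac{\delta x}{\varphi(k)} + O\left(\frac{x}{(\log x)^A}\right). \]
Let $k = 6$, $\ell = -1$, $\delta = 1/50$, and $x = (50/51)(\log N)^2$. For any integer $N > 2$,
\[
\begin{aligned}
\sum_{\substack{(50/51)(\log N)^2 < p \leq (\log N)^2 \\ p \equiv -1 \pmod{6}}} \log p
&= \vartheta\bigl((\log N)^2; 6, -1\bigr) - \vartheta\bigl((50/51)(\log N)^2; 6, -1\bigr) \\
&= \frac{(\log N)^2}{102} + O\left(\frac{(\log N)^2}{(\log\log N)^A}\right).
\end{aligned}
\]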
Since
\[ \sum_{\substack{p \mid N \\ p \equiv -1 \pmod{6}}} \log p \leq \sum_{p \mid N} \log p \leq \log N, \]
it follows that, for $N$ sufficiently large, there must exist at least two prime numbers, $q$ and $r$, such that
\[ q \equiv r \equiv -1 \pmod{6}, \]
\[ (q, N) = (r, N) = 1, \]
and
\[ \frac{50}{51}(\log N)^2 < r < q \leq (\log N)^2 < \frac{51}{50}\, r = 1.02r. \]
The multiplicative group of congruence classes relatively prime to $q^6$ is cyclic of order $\varphi(q^6) = q^5(q - 1)$. Since $q \equiv -1 \pmod{6}$, it follows that $(\varphi(q^6), 3) = 1$, so every integer relatively prime to $q^6$ is a cubic residue modulo $q^6$. Similarly, every integer relatively prime to $r^6$ is a cubic residue modulo $r^6$. Since
\[ (2Nr, q) = (2Nq, r) = 1, \]
there exist integers $u$ and $v$ such that
\[ (u, q) = (v, r) = 1, \]
\[ 4N \equiv u^3r^{18} \pmod{q^6}, \]
and
\[ 2N \equiv v^3q^{18} \pmod{r^6}. \]
The numbers $6$, $q^6$, and $r^6$ are pairwise relatively prime. By the Chinese remainder theorem, there exists an integer $\ell$ such that
\[ \ell \equiv u \pmod{q^6}, \]
\[ \ell \equiv v \pmod{r^6}, \]
\[ \ell \equiv -1 \pmod{6}. \]
Then
\[ 4N \equiv \ell^3r^{18} \pmod{q^6} \]
and
\[ 2N \equiv \ell^3q^{18} \pmod{r^6}. \]
Let
\[ k = 6q^6r^6. \]
Then
\[ (k, \ell) = (6q^6r^6, \ell) = 1. \]
Let
\[ x = N^{1/3}q^{-6}. \]
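Since $q \leq (\log N)^2$, we have, for $N$ sufficiently large,
\[ \log x = \frac{1}{3}\log N - 6\log q \geq \frac{1}{3}\log N - 12\log\log N > \frac{1}{4}\log N \]
and
\[ k = 6q^6r^6 < 6(\log N)^{24} < 6(4\log x)^{24} < (\log x)^{25}. \]
By the Siegel–Walfisz theorem with $A = 25$ and $\delta = 1/50$,
\[
\begin{aligned}
\vartheta((51/50)x; k, \ell) - \vartheta(x; k, \ell)
&= \frac{x}{50\varphi(k)} + O\left(\frac{x}{(\log x)^{25}}\right) \\
&\geq \frac{x}{50k} + O\left(\frac{x}{(\log x)^{25}}\right) \\
&\gg \frac{x}{(\log x)^{24}} + O\left(\frac{x}{(\log x)^{25}}\right) \\
&> 0.
\end{aligned}
\]
Therefore, if $N$ is sufficiently large, there exists a prime $p$ such that
\[ x < p \leq \frac{51x}{50} = 1.02x \]
and
\[ p \equiv \ell \pmod{6q^6r^6}. \]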


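The primes $p, q, r$ are distinct because $(qr, \ell) = 1$. Since $p \equiv -1 \pmod{3}$, every integer is a cubic residue modulo $6p$, and there exists an integer $s$ such that
\[ s^3 \equiv N - 3p \pmod{6p}. \]
By the Chinese remainder theorem, there exists $t$ (with $t \equiv s \pmod{6p}$) such that
\[ t^3 \equiv N - 3p \pmod{6p}, \]
\[ t \equiv 0 \pmod{q^2r^2}, \]
and
\[ 1 \leq t \leq 6pq^2r^2. \]
Let
\[ n = N - t^3. \]
Then
\[ 4n = 4N - 4t^3 \equiv 4N \equiv \ell^3r^{18} \equiv p^3r^{18} \pmod{q^6}, \]
\[ 2n = 2N - 2t^3 \equiv 2N \equiv \ell^3q^{18} \equiv p^3q^{18} \pmod{r^6}, \]
\[ n = N - t^3 \equiv 3p \pmod{6p}. \]
Finally,
\[ n = N - t^3 < N = x^3q^{18} < p^3q^{18} \]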
and
\[
\begin{aligned}
n &= N - t^3 \\
&\geq x^3q^{18} - 216p^3q^6r^6 \\
&\geq (1.02)^{-3}p^3q^{18} - 216p^3q^{12} \\
&= \frac{3}{4}\, p^3q^{18} + \left(\left((1.02)^{-3} - \frac{3}{4}\right)q^6 - 216\right)p^3q^{12} \\
&> \frac{3}{4}\, p^3q^{18}
\end{aligned}
\]
for $N$ sufficiently large. Thus, the integer $n = N - t^3$ and the primes $p, q, r$ satisfy conditions (2.1)–(2.6) of Lemma 2.5, so $N - t^3$ is a sum of six positive cubes. Since $t$ is positive, we see that $N$ is a sum of seven positive cubes. This proves Linnik’s theorem.
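The proof uses the fact that every integer relatively prime to $q^6$ is a cubic residue modulo $q^6$ when $q \equiv -1 \pmod 6$. A small Python experiment, given here only as an illustration (the function name `cubing_is_bijection_on_units` is hypothetical), checks by brute force whether $a \mapsto a^3$ permutes the residues prime to $q$ modulo $q^k$.

```python
def cubing_is_bijection_on_units(q, k):
    """Return True if a -> a^3 permutes the residues prime to q modulo q^k."""
    modulus = q**k
    units = [a for a in range(1, modulus) if a % q != 0]
    cubes = {pow(a, 3, modulus) for a in units}
    return len(cubes) == len(units)


if __name__ == "__main__":
    print(cubing_is_bijection_on_units(5, 6))   # 5 = -1 (mod 6): every unit is a cube
    print(cubing_is_bijection_on_units(11, 2))  # 11 = -1 (mod 6)
    print(cubing_is_bijection_on_units(7, 1))   # 7 = 1 (mod 6): cubing is not onto
```

For $q = 7$ the group of units has order $6$, which is divisible by $3$, so only a third of the units are cubes; this is why the primes in the proof are taken in the class $-1$ modulo $6$.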


2.4 Sums of two cubes 


The subject of this book is additive bases. The generic theorem states that a certain classical sequence of integers, such as the cubes, has the property that every nonnegative integer, or every sufficiently large integer, can be written as the sum of a bounded number of terms of the sequence. In this section, we diverge from this theme to study sums of two cubes.* This is important for several reasons. First, it is part of the unsolved problem of determining $G(3)$, the order of the set of cubes as an asymptotic basis and, in particular, the conjecture that every sufficiently large integer is the sum of four cubes. Second, the equation
\[ N = x^3 + y^3 \tag{2.13} \]
is an elliptic curve. If $r_{3,2}(N)$ denotes the number of representations of the integer $N$ as the sum of two positive cubes, then $r_{3,2}(N)$ counts the number of integral points with positive coordinates that lie on this curve. Counting the number of integral points on a curve is a deep and difficult problem in arithmetic geometry, and the study of sums of two cubes is an important special case.

If $N = x^3 + y^3$ and $x \neq y$, then $N = y^3 + x^3$ is another representation of $N$ as a sum of two cubes. We call two representations
\[ N = x_1^3 + y_1^3 = x_2^3 + y_2^3 \]
essentially distinct if $\{x_1, y_1\} \neq \{x_2, y_2\}$. Note that $N$ has two essentially distinct representations if and only if $r_{3,2}(N) \geq 3$.


*This section can be omitted on the first reading. 
Here are some examples. The smallest number that has two essentially distinct representations as the sum of two positive cubes is 1729. The representations are
\[ 1729 = 1^3 + 12^3 = 9^3 + 10^3. \]
These give four positive integral points on the curve
\[ 1729 = x^3 + y^3, \]
so
\[ r_{3,2}(1729) = 4. \]
The smallest number that has three essentially distinct representations as the sum of two positive cubes is 87,539,319. The representations are
\[
\begin{aligned}
87{,}539{,}319 &= 167^3 + 436^3 \\
&= 228^3 + 423^3 \\
&= 255^3 + 414^3.
\end{aligned}
\]
The cubes in these equations are not relatively prime, because
\[ (228, 423) = (255, 414) = 3. \]
The smallest number that has three essentially distinct representations as the sum of two relatively prime positive cubes is 15,170,835,645. The representations are
\[
\begin{aligned}
15{,}170{,}835{,}645 &= 2468^3 + 517^3 \\
&= 2456^3 + 709^3 \\
&= 2152^3 + 1733^3.
\end{aligned}
\]
The smallest number that has four essentially distinct representations as the sum of two positive cubes is 6,963,472,309,248. The representations are
\[
\begin{aligned}
6{,}963{,}472{,}309{,}248 &= 2421^3 + 19{,}083^3 \\
&= 5436^3 + 18{,}948^3 \\
&= 10{,}200^3 + 18{,}072^3 \\
&= 13{,}322^3 + 16{,}630^3.
\end{aligned}
\]
It is an unsolved problem to find an integer $N$ that has four essentially distinct representations as the sum of two positive cubes that are relatively prime.
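The small examples can be confirmed by a direct search. The following Python sketch is illustrative only; the names `two_cube_representations` and `ordered_count` are hypothetical helpers. It lists, for every $N$ up to a bound, the pairs $x \leq y$ of positive integers with $x^3 + y^3 = N$, and converts such a list into the count $r_{3,2}(N)$ of ordered pairs.

```python
from collections import defaultdict


def two_cube_representations(limit):
    """Map N <= limit to the list of pairs (x, y), 1 <= x <= y, with x^3 + y^3 = N."""
    reps = defaultdict(list)
    x = 1
    while 2 * x**3 <= limit:
        y = x
        while x**3 + y**3 <= limit:
            reps[x**3 + y**3].append((x, y))
            y += 1
        x += 1
    return reps


def ordered_count(pairs):
    """r_{3,2}(N): an unordered pair with x != y gives two ordered representations."""
    return sum(1 if x == y else 2 for x, y in pairs)


if __name__ == "__main__":
    reps = two_cube_representations(2000)
    doubles = sorted(N for N, pairs in reps.items() if len(pairs) >= 2)
    print(doubles)                       # numbers up to 2000 with two essentially distinct representations
    print(reps[1729], ordered_count(reps[1729]))
```

The larger examples are far beyond a naive search in pure Python, but each displayed equation can still be checked by evaluating both sides with exact integer arithmetic.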

In this section, we shall prove three theorems on sums of two cubes. The first is Fermat’s result that there are integers with arbitrarily many representations as the sum of two positive cubes, that is,
\[ \limsup_{N \to \infty} r_{3,2}(N) = \infty. \]
Next we shall prove a theorem of Erdős and Mahler. Let $C_2(n)$ be the number of integers up to $n$ that can be represented as the sum of two positive cubes. Since the number of positive cubes up to $n$ is at most $n^{1/3}$, it follows that $C_2(n)$ is at most $n^{2/3}$. Erdős and Mahler proved that this is the correct order of magnitude for $C_2(n)$, that is,
\[ C_2(n) = \sum_{\substack{N \leq n \\ r_{3,2}(N) \geq 1}} 1 \gg n^{2/3}. \]
However, numbers with two or more essentially distinct representations as sums of two cubes are rare. Erdős observed that the number $C_2'(n)$ of integers up to $n$ that have at least two essentially distinct representations as the sum of two cubes is $o(n^{2/3})$. More precisely, we shall prove a theorem of Hooley that states that
\[ C_2'(n) \ll n^{5/9 + \varepsilon}. \]
This implies that almost every integer that can be written as the sum of two positive cubes has an essentially unique representation in this form.


Theorem 2.4 (Fermat) For every $k \geq 1$, there exist an integer $N$ and $k$ pairwise disjoint sets of positive integers $\{x_i, y_i\}$ such that
\[ N = x_i^3 + y_i^3 \]
for $i = 1, \ldots, k$. Equivalently,
\[ \limsup_{N \to \infty} r_{3,2}(N) = \infty. \]


Proof. The functions
\[ f(x, y) = \frac{x(x^3 + 2y^3)}{x^3 - y^3} \]
and
\[ g(x, y) = \frac{y(2x^3 + y^3)}{x^3 - y^3} \]
satisfy the polynomial identity
\[ f(x, y)^3 - g(x, y)^3 = x^3 + y^3. \]
If
\[ F(u, v) = \frac{u(u^3 - 2v^3)}{u^3 + v^3} = f(u, -v) \]
and
\[ G(u, v) = \frac{v(2u^3 - v^3)}{u^3 + v^3} = -g(u, -v), \]
then
\[ F(u, v)^3 + G(u, v)^3 = f(u, -v)^3 - g(u, -v)^3 = u^3 + (-v)^3 = u^3 - v^3. \]
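Let
\[ 0 < \varepsilon < \frac{1}{4}. \]
Let $x_1$ and $y_1$ be positive rational numbers such that
\[ 0 < \frac{y_1}{x_1} < \varepsilon. \]
We define
\[ u = f(x_1, y_1), \]
\[ v = g(x_1, y_1). \]
Then $u$ and $v$ are positive rational numbers such that
\[ u^3 - v^3 = x_1^3 + y_1^3 > 0. \]
Moreover,
\[ \frac{u}{v} = \frac{x_1(x_1^3 + 2y_1^3)}{y_1(2x_1^3 + y_1^3)} = \frac{x_1}{2y_1}\left(\frac{1 + 2\rho^3}{1 + \rho^3/2}\right), \]
where $\rho = y_1/x_1 \in (0, 1/4)$. Since
\[ 1 < \frac{1 + 2\rho^3}{1 + \rho^3/2} = 1 + \frac{3\rho^3}{2 + \rho^3} < 1 + \frac{3\rho^3}{2}, \]
it follows that
\[ 0 < \frac{u}{v} - \frac{x_1}{2y_1} < \frac{3x_1\rho^3}{4y_1} = \frac{3x_1}{4y_1}\left(\frac{y_1}{x_1}\right)^3 = \frac{3}{4}\left(\frac{y_1}{x_1}\right)^2 < \frac{3\varepsilon^2}{4} \]
and
\[ \frac{u}{v} > \frac{x_1}{2y_1} > \frac{1}{2\varepsilon} > 2. \tag{2.14} \]
Next, we define
\[ x_2 = F(u, v), \]
\[ y_2 = G(u, v). \]
Since $u > 2v$, it follows from the definition of the functions $F(u, v)$ and $G(u, v)$ that $x_2$ and $y_2$ are positive rational numbers. Moreover,
\[ x_2^3 + y_2^3 = u^3 - v^3 = x_1^3 + y_1^3. \]
Let $\sigma = v/u$. Then
\[ 0 < \sigma < 2\varepsilon < 1/2 \]
by (2.14) and
\[ \frac{x_2}{y_2} = \frac{u(u^3 - 2v^3)}{v(2u^3 - v^3)} = \frac{u}{2v}\left(\frac{1 - 2\sigma^3}{1 - \sigma^3/2}\right). \]
Since
\[ \frac{1 - 2\sigma^3}{1 - \sigma^3/2} = 1 - \frac{3\sigma^3}{2 - \sigma^3}, \]
it follows that
\[ 0 < \frac{u}{2v} - \frac{x_2}{y_2} = \frac{u}{2v}\left(\frac{3\sigma^3}{2 - \sigma^3}\right) = \frac{3\sigma^2}{2(2 - \sigma^3)} < \frac{3\sigma}{4} = \frac{3v}{4u} < \frac{3\varepsilon}{2}. \]
Thus,
\[ \left|\frac{x_2}{y_2} - \frac{x_1}{4y_1}\right| \leq \left|\frac{x_2}{y_2} - \frac{u}{2v}\right| + \frac{1}{2}\left|\frac{u}{v} - \frac{x_1}{2y_1}\right| < \frac{3\varepsilon}{2} + \frac{3\varepsilon^2}{8} < 2\varepsilon, \]
and so
\[ \frac{x_2}{y_2} > \frac{x_1}{4y_1} - 2\varepsilon > \frac{1}{4\varepsilon} - 2\varepsilon > \frac{1}{8\varepsilon}. \]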


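This proves that if $x_1$ and $y_1$ are positive rational numbers such that
\[ 0 < \frac{y_1}{x_1} < \varepsilon < 1/4, \]
then there exist positive rational numbers $x_2$ and $y_2$ such that
\[ x_2^3 + y_2^3 = x_1^3 + y_1^3, \]
\[ 0 < \frac{y_2}{x_2} < 8\varepsilon, \]
and
\[ \left|\frac{x_2}{y_2} - \frac{x_1}{4y_1}\right| < 8\varepsilon. \]
If $8\varepsilon < 1/4$, then there exist positive rational numbers $x_3$ and $y_3$ such that
\[ x_3^3 + y_3^3 = x_2^3 + y_2^3, \]
\[ 0 < \frac{y_3}{x_3} < 8^2\varepsilon, \]
and
\[ \left|\frac{x_3}{y_3} - \frac{x_2}{4y_2}\right| < 8^2\varepsilon. \]
Similarly, if $k \geq 2$ and
\[ 0 < 8^{k-2}\varepsilon < \frac{1}{4}, \]
then there exist positive rational numbers $x_1, y_1, x_2, y_2, \ldots, x_k, y_k$ such that
\[ x_1^3 + y_1^3 = x_2^3 + y_2^3 = \cdots = x_k^3 + y_k^3, \]
\[ 0 < \frac{y_i}{x_i} < 8^{i-1}\varepsilon \quad \text{for } i = 1, \ldots, k, \]
and
\[ \left|\frac{x_{i+1}}{y_{i+1}} - \frac{x_i}{4y_i}\right| < 8^i\varepsilon \quad \text{for } i = 1, \ldots, k-1. \]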


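Let $\varepsilon = 8^{-k}$. We shall prove that the $k$ sets $\{x_i, y_i\}$ are pairwise disjoint. Since
\[ \left|\frac{4^{j-1}x_{i+j}}{y_{i+j}} - \frac{4^{j-2}x_{i+j-1}}{y_{i+j-1}}\right| < 4^{j-1} \cdot 8^{i+j-1}\varepsilon = 8^i\varepsilon \cdot 32^{j-1} \]
for $j = 1, \ldots, k - i$, it follows that
\[
\begin{aligned}
\left|\frac{4^{\ell-1}x_{i+\ell}}{y_{i+\ell}} - \frac{x_i}{4y_i}\right|
&\leq \sum_{j=1}^{\ell}\left|\frac{4^{j-1}x_{i+j}}{y_{i+j}} - \frac{4^{j-2}x_{i+j-1}}{y_{i+j-1}}\right| \\
&< 8^i\varepsilon\sum_{j=1}^{\ell} 32^{j-1} \\
&< 8^i 32^{\ell}\varepsilon
\end{aligned}
\]
for $1 \leq i < i + \ell \leq k$. If $x_i = x_{i+\ell}$ and $y_i = y_{i+\ell}$ for some $\ell \geq 1$, then
\[ \frac{x_{i+\ell}}{y_{i+\ell}} = \frac{x_i}{y_i} \]
and
\[ \frac{x_i}{y_i}\left(4^{\ell-1} - \frac{1}{4}\right) = \left|\frac{4^{\ell-1}x_{i+\ell}}{y_{i+\ell}} - \frac{x_i}{4y_i}\right| < 8^i 32^{\ell}\varepsilon. \]
Since $x_i/y_i > 1/(8^{i-1}\varepsilon)$ and $i + \ell \leq k$, it follows that
\[ \frac{3}{4} \leq 4^{\ell-1} - \frac{1}{4} < 8^{2i-1}32^{\ell}\varepsilon^2 = 8^{2i-2k-1}32^{\ell} \leq 8^{-2\ell-1}32^{\ell} = 2^{-\ell-3} \leq \frac{1}{16}, \]
which is absurd. Therefore, $\{x_1, y_1\}, \ldots, \{x_k, y_k\}$ are $k$ pairwise disjoint sets of positive rational numbers. Let $d$ be a common denominator for the $2k$ numbers $x_1, \ldots, x_k, y_1, \ldots, y_k$, and let $N = (dx_1)^3 + (dy_1)^3$. Then $\{dx_1, dy_1\}, \ldots, \{dx_k, dy_k\}$ are pairwise disjoint sets of positive integers, and
\[ (dx_1)^3 + (dy_1)^3 = (dx_2)^3 + (dy_2)^3 = \cdots = (dx_k)^3 + (dy_k)^3 = N, \]
that is, $r_{3,2}(N) \geq k$. This proves Fermat’s theorem.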
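One step of the construction in this proof can be carried out with exact rational arithmetic. The Python sketch below is only an illustration under stated assumptions: the helper names `next_pair` and `clear_denominators` are hypothetical, and the starting point $(x_1, y_1) = (5, 1)$ is an arbitrary choice with $y_1/x_1 < 1/4$. It applies $f, g$ and then $F, G$ to obtain a second rational point with the same sum of cubes, and then clears denominators as in the last paragraph of the proof.

```python
from fractions import Fraction
from math import gcd


def f(x, y):
    return x * (x**3 + 2 * y**3) / (x**3 - y**3)


def g(x, y):
    return y * (2 * x**3 + y**3) / (x**3 - y**3)


def F(u, v):
    return u * (u**3 - 2 * v**3) / (u**3 + v**3)


def G(u, v):
    return v * (2 * u**3 - v**3) / (u**3 + v**3)


def next_pair(x, y):
    """From (x, y) build (x2, y2) with x2^3 + y2^3 = x^3 + y^3, via u = f, v = g."""
    u, v = f(x, y), g(x, y)
    return F(u, v), G(u, v)


def clear_denominators(points):
    """Scale rational points by a common denominator d; return N and the integer pairs."""
    d = 1
    for x, y in points:
        for value in (x, y):
            d = d * value.denominator // gcd(d, value.denominator)
    pairs = [(int(d * x), int(d * y)) for x, y in points]
    return pairs[0][0]**3 + pairs[0][1]**3, pairs


if __name__ == "__main__":
    x1, y1 = Fraction(5), Fraction(1)
    x2, y2 = next_pair(x1, y1)
    print(x2**3 + y2**3 == x1**3 + y1**3)   # both sums of cubes equal 126
    N, pairs = clear_denominators([(x1, y1), (x2, y2)])
    print(all(a**3 + b**3 == N for a, b in pairs))
```

The integers produced this way grow very quickly, which is consistent with the fact that the proof is an existence argument and does not yield the smallest $N$ with $k$ representations.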
Next, we shall prove the Erdős–Mahler theorem. This requires four elementary lemmas.


Lemma 2.6 Let $a$ and $b$ be positive integers such that
\[ a < b. \]
Let $r(a, b)$ denote the number of pairs $(x, y)$ of integers such that
\[ x^3 + (a - x)^3 = y^3 + (b - y)^3 \tag{2.15} \]
and
\[ 0 < x < \frac{a}{2} \quad\text{and}\quad 0 < y < \frac{b}{2}. \tag{2.16} \]
Then
\[ r(a, b) \leq 5a^{2/3}. \]

Proof. The function
\[ f_a(x) = x^3 + (a - x)^3 = 3ax^2 - 3a^2x + a^3 \]
is strictly decreasing for $0 \leq x \leq a/2$. Let $r = r(a, b) \geq 1$. Let $(x_1, y_1), \ldots, (x_r, y_r)$ be the distinct solutions of equation (2.15) that satisfy inequalities (2.16), and let
\[ 0 < x_1 < \cdots < x_r < \frac{a}{2}. \]
Then
\[ \frac{b^3}{4} = f_b\left(\frac{b}{2}\right) < f_b(y_1) = f_a(x_1) < f_a(0) = a^3, \]
and so
\[ a < b < 4^{1/3}a < 2a. \tag{2.17} \]
For $i = 1, \ldots, r - 1$ we have
\[ f_b(y_{i+1}) = f_a(x_{i+1}) < f_a(x_i) = f_b(y_i), \]
and so
\[ 0 < y_1 < \cdots < y_r < \frac{b}{2}. \]
Moreover, the point $(x_i, y_i)$ is a solution of equation (2.15) if and only if $(x_i, y_i)$ lies on the hyperbola
\[ a\left(x - \frac{a}{2}\right)^2 - b\left(y - \frac{b}{2}\right)^2 = c, \]
where
\[ c = \frac{b^3 - a^3}{12} > 0. \]
For $i = 1, \ldots, r$, let
\[ u_i = \frac{a}{2} - x_i \]
and
\[ v_i = \frac{b}{2} - y_i. \]
Then
\[ 0 < u_r < \cdots < u_1 < \frac{a}{2}, \]
\[ 0 < v_r < \cdots < v_1 < \frac{b}{2}, \]
and $(u_i, v_i)$ is a point in the first quadrant of the $uv$-plane that lies on the hyperbola
\[ au^2 - bv^2 = c. \]
Since the hyperbola is convex downwards in the first quadrant, it follows that
\[ \frac{v_{i+1} - v_i}{u_{i+1} - u_i} > \frac{v_i - v_{i-1}}{u_i - u_{i-1}} \]
for $i = 2, \ldots, r - 1$, and so the $r - 1$ fractions
\[ \frac{v_{i+1} - v_i}{u_{i+1} - u_i} = \frac{y_{i+1} - y_i}{x_{i+1} - x_i} \]
are distinct for $i = 1, \ldots, r - 1$. If $r_1$ is the number of points $(x_i, y_i)$ such that
\[ x_{i+1} - x_i > \frac{a^{1/3}}{2}, \]
then
\[ \frac{a^{1/3}r_1}{2} < \frac{a}{2}, \]
and so
\[ r_1 < a^{2/3}. \]
Similarly, if $r_2$ is the number of points $(x_i, y_i)$ such that
\[ y_{i+1} - y_i > \frac{a^{1/3}}{2}, \]
then
\[ \frac{a^{1/3}r_2}{2} < \frac{b}{2} < a \]
by (2.17), and so
\[ r_2 < 2a^{2/3}. \]
Let $r_3$ denote the number of points $(x_i, y_i)$ such that
\[ 1 \leq x_{i+1} - x_i \leq \frac{a^{1/3}}{2} \]
and
\[ 1 \leq y_{i+1} - y_i \leq \frac{a^{1/3}}{2}. \]
Since the fractions
\[ \frac{y_{i+1} - y_i}{x_{i+1} - x_i} \]
are distinct, and the numerators and denominators are bounded by $a^{1/3}/2$, we have
\[ r_3 \leq \left(\frac{a^{1/3}}{2}\right)^2 = \frac{a^{2/3}}{4}. \]
Therefore,
\[ r(a, b) \leq r_1 + r_2 + r_3 + 1 < 3a^{2/3} + \frac{a^{2/3}}{4} + 1 \leq 5a^{2/3}. \]
This completes the proof.
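The bound of Lemma 2.6 can be compared with a brute-force count for small values of $a$ and $b$. The Python sketch below is an illustration only; the function name `r` mirrors the notation $r(a, b)$ of the lemma, and the ranges tested are arbitrary. By (2.17) there are no solutions unless $b < 2a$, so only those pairs are examined.

```python
def r(a, b):
    """Count integer pairs (x, y) with x^3 + (a-x)^3 = y^3 + (b-y)^3, 0 < x < a/2, 0 < y < b/2."""
    # f_b is strictly decreasing on (0, b/2), so its values there are distinct.
    values = {y**3 + (b - y)**3 for y in range(1, (b + 1) // 2)}
    return sum(1 for x in range(1, (a + 1) // 2) if x**3 + (a - x)**3 in values)


if __name__ == "__main__":
    # 1729 = 1^3 + 12^3 = 9^3 + 10^3 corresponds to a = 13, b = 19, x = 1, y = 9.
    print(r(13, 19))
    print(all(r(a, b) <= 5 * a ** (2 / 3)
              for a in range(1, 40) for b in range(a + 1, 2 * a + 1)))
```

In this small range the count is far below $5a^{2/3}$; the lemma only needs an upper bound of this order of magnitude.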
Lemma 2.7 Let $x$ and $y$ be positive integers, $(x, y) = 1$. If the prime $p \neq 3$ divides
\[ \frac{x^3 + y^3}{x + y}, \]
then
\[ p \equiv 1 \pmod{3}. \]

Proof. Let $p \neq 3$ be a prime such that
\[ x^2 - xy + y^2 = \frac{x^3 + y^3}{x + y} \equiv 0 \pmod{p}. \]
If $p$ divides $y$, then $p$ also divides $x$, which is impossible because $(x, y) = 1$. Therefore, $(p, y) = 1$. Since
\[ (2x - y)^2 + 3y^2 \equiv 0 \pmod{p}, \]
it follows that $-3$ is a quadratic residue modulo $p$. Let $\left(\frac{\cdot}{p}\right)$ be the Legendre symbol. By quadratic reciprocity, we have
\[ \left(\frac{-3}{p}\right) = \left(\frac{p}{3}\right) = 1 \]
if and only if $p \equiv 1 \pmod{3}$. This completes the proof.
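Lemma 2.7 is easy to test numerically. The following Python sketch is illustrative only (the names `prime_factors` and `check_lemma_2_7` are hypothetical): for all coprime pairs $x, y$ up to a bound, it factors $x^2 - xy + y^2$ by trial division and checks that every prime factor other than $3$ is congruent to $1$ modulo $3$.

```python
from math import gcd


def prime_factors(n):
    """Distinct prime factors of n >= 1, by trial division."""
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            factors.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors


def check_lemma_2_7(limit):
    """True if every prime p != 3 dividing x^2 - xy + y^2 with (x, y) = 1 satisfies p = 1 (mod 3)."""
    for x in range(1, limit + 1):
        for y in range(1, limit + 1):
            if gcd(x, y) == 1:
                for p in prime_factors(x * x - x * y + y * y):
                    if p != 3 and p % 3 != 1:
                        return False
    return True


if __name__ == "__main__":
    print(check_lemma_2_7(60))
```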
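In the proof of the next lemma, we shall use some results from multiplicative number theory. Let $\pi(x; 3, 2)$ denote the number of primes $p \leq x$ such that $p \equiv 2 \pmod{3}$. By the prime number theorem for arithmetic progressions, $\pi(x; 3, 2) \sim x/(2\log x)$. Moreover, there exists a constant $A$ such that
\[ \sum_{\substack{p \leq x \\ p \equiv 2 \pmod{3}}} \frac{1}{p} = \frac{1}{2}\log\log x + A + O\left(\frac{1}{\log x}\right). \]
This implies that
\[
\begin{aligned}
\sum_{\substack{x^{10/11} < p \leq x \\ p \equiv 2 \pmod{3}}} \frac{1}{p}
&= \frac{1}{2}\log\log x - \frac{1}{2}\log\log x^{10/11} + O\left(\frac{1}{\log x}\right) \\
&= \frac{1}{2}\log\frac{11}{10} + O\left(\frac{1}{\log x}\right).
\end{aligned}
\]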


Lemma 2.8 For any positive integer $a$, let $h(a)$ denote the largest divisor of $a$ consisting only of primes $p \equiv 1 \pmod{3}$, that is,
\[ h(a) = \prod_{\substack{p^k \,\|\, a \\ p \equiv 1 \pmod{3}}} p^k. \tag{2.18} \]
Let $H(x)$ denote the number of positive integers $a$ up to $x$ such that $h(a) \leq a^{1/10}$ and $a$ is not divisible by 3. There exists a constant $\delta_1 \in (0, 1)$ such that
\[ H(x) > \delta_1 x \]
for all $x \geq 2$.


Proof. Let $H_0(x)$ denote the number of positive integers $a \leq x$ of the form $a = pb$, where $p \equiv 2 \pmod{3}$ is a prime such that $p > x^{10/11}$, and $b$ is an integer not divisible by 3. An integer $a$ has at most one representation of this form. Moreover,
\[ h(a) = h(b) \leq b = \frac{a}{p} \leq \frac{x}{p} < x^{1/11} < p^{1/10} \leq a^{1/10}. \]
It follows that every number of the form $pb$ is counted in $H(x)$, and so
\[ H_0(x) \leq H(x). \]
Also, $H_0(2) = H(2) = 1$. Let $g(x)$ denote the number of positive integers up to $x$ not divisible by 3. Then
\[ g(x) > \frac{2x}{3} - 1 \]
and
\[
\begin{aligned}
H_0(x) &= \sum_{\substack{x^{10/11} < p \leq x \\ p \equiv 2 \pmod{3}}} g\left(\frac{x}{p}\right) \\
&> \sum_{\substack{x^{10/11} < p \leq x \\ p \equiv 2 \pmod{3}}} \left(\frac{2x}{3p} - 1\right) \\
&\geq \frac{2x}{3}\sum_{\substack{x^{10/11} < p \leq x \\ p \equiv 2 \pmod{3}}} \frac{1}{p} - \pi(x; 3, 2) \\
&= \frac{2x}{3}\left(\frac{1}{2}\log\frac{11}{10} + O\left(\frac{1}{\log x}\right)\right) + O\left(\frac{x}{\log x}\right) \\
&= \frac{x}{3}\log\frac{11}{10} + O\left(\frac{x}{\log x}\right) \\
&\gg x.
\end{aligned}
\]
This completes the proof.
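The function $h(a)$ of (2.18) is simple to compute. The Python sketch below is an illustration only; the names `h` and `H` follow the notation of Lemma 2.8, and the condition $h(a) \leq a^{1/10}$ is tested in the equivalent integer form $h(a)^{10} \leq a$ to avoid floating-point roots.

```python
def h(a):
    """Largest divisor of a composed only of primes p = 1 (mod 3)."""
    result = 1
    n = a
    p = 2
    while p * p <= n:
        if n % p == 0:
            prime_power = 1
            while n % p == 0:
                n //= p
                prime_power *= p
            if p % 3 == 1:
                result *= prime_power
        p += 1
    if n > 1 and n % 3 == 1:   # leftover prime factor
        result *= n
    return result


def H(x):
    """Number of a <= x, not divisible by 3, with h(a)^10 <= a."""
    return sum(1 for a in range(1, x + 1) if a % 3 != 0 and h(a) ** 10 <= a)


if __name__ == "__main__":
    print(h(28), h(91), h(10))
    for x in (10, 100, 1000, 10000):
        print(x, H(x), H(x) / x)   # the ratio H(x)/x stays bounded away from 0
```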


Lemma 2.9 Let $\varphi(d)$ be the Euler $\varphi$-function, and let $0 < \delta < 1$. There exists a constant $c_1 = c_1(\delta) > 0$ such that, if $n$ is a positive integer and $t \geq \delta n$, and if
\[ a_1 < \cdots < a_t \leq n \]
are any $t$ positive integers, then
\[ \sum_{i=1}^{t} \varphi(a_i) \geq c_1n^2. \]

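Proof. For any prime $p \geq 7$, we have
\[
\begin{aligned}
\left(1 - \frac{2}{p^2}\right)^p &= 1 - \frac{2}{p} + \sum_{k=2}^{p}\binom{p}{k}\left(\frac{-2}{p^2}\right)^k \\
&< 1 - \frac{2}{p} + \sum_{k=2}^{p} p^k\,\frac{2^k}{p^{2k}} \\
&< 1 - \frac{2}{p} + \sum_{k=2}^{\infty}\left(\frac{2}{p}\right)^k \\
&= 1 - \frac{2}{p} + \frac{4}{p(p-2)} \\
&< 1 - \frac{1}{p}.
\end{aligned}
\]
Since the infinite product
\[ \prod_{p \geq 7}\left(1 - \frac{2}{p^2}\right) \]
converges, we have
\[ \prod_{p}\left(1 - \frac{1}{p}\right)^{1/p} = c_2, \]
where
\[ 0 < c_2 < 1. \]
Since $\varphi(d) = d\prod_{p \mid d}(1 - 1/p)$ and $n! > (n/e)^n$, it follows that
\[
\begin{aligned}
\prod_{d=1}^{n}\varphi(d) &= \prod_{d=1}^{n} d\prod_{p \mid d}\left(1 - \frac{1}{p}\right) \\
&= n!\prod_{p \leq n}\left(1 - \frac{1}{p}\right)^{[n/p]} \\
&> \left(\frac{c_2n}{e}\right)^n.
\end{aligned}
\]
Choose $c_3 > 0$ so that
\[ c_3^{\delta/2} < \frac{c_2}{e}. \]
Let
\[ m = \left[\frac{\delta n}{2}\right] \leq \frac{\delta n}{2} < m + 1. \]
Suppose that there exists a set $D \subseteq [1, n]$ such that $|D| = m + 1$ and $\varphi(d) \leq c_3n$ for all $d \in D$. Since $\varphi(d) \leq d \leq n$ for all $d \leq n$, we have
\[
\begin{aligned}
\prod_{d=1}^{n}\varphi(d) &= \prod_{\substack{d=1 \\ d \in D}}^{n}\varphi(d)\prod_{\substack{d=1 \\ d \notin D}}^{n}\varphi(d) \\
&\leq (c_3n)^{m+1}n^{n-m-1} \\
&= c_3^{m+1}n^n \\
&< c_3^{\delta n/2}n^n \\
&< \left(\frac{c_2n}{e}\right)^n,
\end{aligned}
\]
which is impossible. It follows that there exist at most $m$ integers in $[1, n]$ with $\varphi(d) \leq c_3n$. In particular, among the $t \geq \delta n$ integers $a_i$, there must be at least
\[ t - m \geq \delta n - \frac{\delta n}{2} = \frac{\delta n}{2} \]
integers for which $\varphi(a_i) > c_3n$, and so
\[ \sum_{i=1}^{t}\varphi(a_i) > \left(\frac{\delta n}{2}\right)c_3n = \frac{c_3\delta}{2}\,n^2 = c_1n^2, \]
where $c_1 = c_3\delta/2$. This completes the proof.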


Theorem 2.5 (Erdős–Mahler) Let $C_2'(n)$ denote the number of integers not exceeding $n$ that can be written as the sum of two positive, relatively prime integral cubes. Then
\[ C_2'(n) \gg n^{2/3}. \]

Proof. Let
\[ h(a) = \prod_{\substack{p^k \,\|\, a \\ p \equiv 1 \pmod{3}}} p^k \]
and let
\[ a_1 < \cdots < a_t \leq n^{1/3} \]
be the integers in $[1, n^{1/3}]$ not divisible by 3 such that
\[ h(a_i) \leq a_i^{1/10}. \]
Then $h(1) = h(2) = 1$ and so $a_2 = 2$. By Lemma 2.8, we have
\[ t = H(n^{1/3}) > \delta_1n^{1/3}. \]
Let $x$ and $y$ be positive integers such that
\[ x + y = a_i \quad\text{for some } i = 1, \ldots, t. \]
Then
\[ x^3 + y^3 < (x + y)^3 = a_i^3 \leq n. \]
Moreover, $(x, y) = 1$ if and only if $(x, a_i) = (y, a_i) = 1$. Therefore, the number of pairs $x, y$ of positive integers such that $x + y = a_i$, $x < y$, and $(x, y) = 1$ is $\varphi(a_i)/2$.

Let $r(m)$ denote the number of representations of $m$ in the form
\[ m = x^3 + y^3, \]
where $x$ and $y$ are relatively prime positive integers such that $x < y$ and $x + y = a_i$ for some $i$. Then
\[ R_1 = \sum_{m=1}^{n} r(m) = \frac{1}{2}\sum_{i=2}^{t}\varphi(a_i) \gg n^{2/3} \]
by Lemma 2.9.
Let $R_2$ be the number of ordered quadruples $(x, y, u, v)$ of positive integers such that
\[ x^3 + y^3 = u^3 + v^3, \]
\[ a_i = x + y < u + v = a_j \quad\text{for } i, j \in [1, t], \]
\[ (x, y) = (u, v) = 1, \]
\[ x < y \quad\text{and}\quad u < v. \]
Note that if $x^3 + y^3 = u^3 + v^3$, then $x + y = u + v$ if and only if $\{x, y\} = \{u, v\}$ (Exercise 7). Then
\[ R_2 = \sum_{m=1}^{n}\binom{r(m)}{2}. \]
Let $(x, y, u, v)$ be a quadruple counted in $R_2$. Since
\[ \frac{a_i}{h(a_i)}\,h(a_i)\,\frac{x^3 + y^3}{x + y} = \frac{a_j}{h(a_j)}\,h(a_j)\,\frac{u^3 + v^3}{u + v} \]
and $a_i$ and $a_j$ are not divisible by 3, it follows from (2.18) that $a_i/h(a_i)$ and $a_j/h(a_j)$ are products of primes $p \equiv 2 \pmod{3}$. By Lemma 2.7,
\[ \left(p, \frac{x^3 + y^3}{x + y}\right) = \left(p, \frac{u^3 + v^3}{u + v}\right) = 1 \]
if $p \equiv 2 \pmod{3}$. Therefore,
\[ \frac{a_i}{h(a_i)} = \frac{a_j}{h(a_j)}. \]

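Fix the integer $a_i$. Since
\[ 0 < \left(\frac{a_j}{h(a_j)}\right)h(a_j) = a_j \leq n^{1/3} \]
and
\[ \frac{a_j}{h(a_j)} = \frac{a_i}{h(a_i)} \geq a_i^{9/10}, \]
it follows that
\[ h(a_j) \leq \frac{n^{1/3}}{a_i^{9/10}}. \]
Since $a_j$ is determined by $a_i$ and $h(a_j)$, to each $a_i$ there correspond at most
\[ \frac{n^{1/3}}{a_i^{9/10}} \]
different integers $a_j$. By Lemma 2.6, the number of quadruples $(x, y, u, v)$ such that $x + y = a_i$ and $u + v = a_j$ is at most $5a_i^{2/3}$. Therefore, the number $R_{2,i}$ of quadruples $(x, y, u, v)$ such that $x + y = a_i$ satisfies
\[ R_{2,i} \leq 5a_i^{2/3}\,\frac{n^{1/3}}{a_i^{9/10}} = \frac{5n^{1/3}}{a_i^{7/30}}, \]
and so
\[
\begin{aligned}
R_2 &= \sum_{i=1}^{t} R_{2,i} \\
&\leq 5\sum_{i=1}^{t}\frac{n^{1/3}}{a_i^{7/30}} \\
&\leq 5n^{1/3}\sum_{1 \leq i \leq n^{1/3}} i^{-7/30} \\
&\ll n^{1/3}(n^{1/3})^{23/30} \\
&= n^{(2/3) - (7/90)}.
\end{aligned}
\]
Let $C_2'(n)$ count the number of integers $m$ up to $n$ of the form $m = x^3 + y^3$, where $x$ and $y$ are relatively prime positive integers. Since
\[ r \leq 1 + \binom{r}{2} \]
for all integers $r$, we have
\[ R_1 = \sum_{\substack{m=1 \\ r(m) \geq 1}}^{n} r(m) \leq \sum_{\substack{m=1 \\ r(m) \geq 1}}^{n} 1 + \sum_{\substack{m=1 \\ r(m) \geq 1}}^{n}\binom{r(m)}{2} \leq C_2'(n) + R_2. \]
Therefore,
\[ C_2'(n) \geq R_1 - R_2 \gg n^{2/3}. \]
This completes the proof.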
The Erdős–Mahler theorem states that many integers can be written as the sum of two positive cubes. Hooley showed that very few numbers have two essentially distinct representations in this form. To prove this, we need the following result of Vaughan–Wooley [130, Lemma 3.5] from the elementary theory of binary quadratic forms.

Lemma 2.10 Let $\varepsilon > 0$. For any nonzero integers $D$ and $N$, the number of solutions of the equation
\[ X^2 - DY^2 = N \]
with
\[ \max(|X|, |Y|) \leq P \]
is
\[ \ll (DNP)^{\varepsilon}, \]
where the implied constant depends only on $\varepsilon$.

Proof. See Hua [63, chapter 11] or Landau [78, part 4].


The following lemma on “completing the square” shows how to transform certain quadratic equations in two variables into Pell’s equations.

Lemma 2.11 Let $a, b, c$ be integers such that $a \neq 0$ and $D = b^2 - 4ac \neq 0$. Let $(x, y)$ be a solution of the equation
\[ ax^2 + bxy + cy^2 + dx + ey + f = 0. \tag{2.19} \]
Let
\[ X = Dy - 2ae + bd \]
and
\[ Y = 2ax + by + d. \]
Then $(X, Y)$ is a solution of the equation
\[ X^2 - DY^2 = N, \]
where
\[ N = (4af - d^2)D + (2ae - bd)^2. \tag{2.20} \]
Moreover, this map sending $(x, y)$ to $(X, Y)$ is one-to-one.
The number $D = b^2 - 4ac$ is called the discriminant of equation (2.19).
Proof. Multiplying equation (2.19) by $4a$, we obtain
\[
\begin{aligned}
&4a^2x^2 + 4abxy + 4acy^2 + 4adx + 4aey + 4af \\
&\quad= (2ax + by)^2 - Dy^2 + 2d(2ax + by) + 2(2ae - bd)y + 4af \\
&\quad= (2ax + by + d)^2 - Dy^2 + 2(2ae - bd)y + (4af - d^2) \\
&\quad= Y^2 - Dy^2 + 2(2ae - bd)y + (4af - d^2) \\
&\quad= 0,
\end{aligned}
\]
where
\[ Y = 2ax + by + d. \]
Multiplying by $-D$, we obtain
\[
\begin{aligned}
&D^2y^2 - 2(2ae - bd)Dy - DY^2 - (4af - d^2)D \\
&\quad= (Dy - 2ae + bd)^2 - DY^2 - (4af - d^2)D - (2ae - bd)^2 \\
&\quad= X^2 - DY^2 - \bigl((4af - d^2)D + (2ae - bd)^2\bigr) \\
&\quad= X^2 - DY^2 - N \\
&\quad= 0,
\end{aligned}
\]
where
\[ X = Dy - 2ae + bd \]
and
\[ N = (4af - d^2)D + (2ae - bd)^2. \]
The determinant of the affine map that sends $(x, y)$ to $(X, Y)$ is
\[ \begin{vmatrix} 0 & D \\ 2a & b \end{vmatrix} = -2aD \neq 0 \]
since $a \neq 0$ and $D \neq 0$, and so the map $(x, y) \mapsto (X, Y)$ is one-to-one. This completes the proof.
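The change of variables in Lemma 2.11 can be checked on a concrete equation. The Python sketch below is illustrative only; the function name `pell_transform` and the sample coefficients are hypothetical choices, with $f$ picked so that $(x, y) = (2, 3)$ is a solution of (2.19).

```python
def pell_transform(a, b, c, d, e, f, x, y):
    """Return (X, Y, D, N) as defined in Lemma 2.11 for a point (x, y)."""
    D = b * b - 4 * a * c
    N = (4 * a * f - d * d) * D + (2 * a * e - b * d) ** 2
    X = D * y - 2 * a * e + b * d
    Y = 2 * a * x + b * y + d
    return X, Y, D, N


if __name__ == "__main__":
    a, b, c, d, e = 1, 1, -1, 2, -3
    x, y = 2, 3
    f = -(a * x * x + b * x * y + c * y * y + d * x + e * y)  # forces (x, y) onto the conic
    X, Y, D, N = pell_transform(a, b, c, d, e, f, x, y)
    print(X, Y, D, N)
    print(X * X - D * Y * Y == N)
```

For these values the point $(2, 3)$ on $x^2 + xy - y^2 + 2x - 3y + 4 = 0$ is sent to $(X, Y) = (23, 9)$ on $X^2 - 5Y^2 = 124$.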


Lemma 2.12 Let $P \geq 2$, and let $a, b, c, d, e, f$ be integers such that
\[ \max\{|a|, \ldots, |f|\} \ll P^2. \]
Let $D = b^2 - 4ac$, and define the integer $N$ by (2.20). Let $W$ denote the number of solutions of the equation
\[ ax^2 + bxy + cy^2 + dx + ey + f = 0 \]
with $\max(|x|, |y|) \leq P$. If $a$, $D$, and $N$ are nonzero, then
\[ W \ll P^{\varepsilon} \]
for any $\varepsilon > 0$, where the implied constant depends only on $\varepsilon$.


Proof. By Lemma 2.11, to every solution $(x, y)$ of the quadratic equation (2.19) there corresponds a solution of the equation
\[ X^2 - DY^2 = N, \]
where
\[ D = b^2 - 4ac \ll P^4 \]
and
\[ N = (4af - d^2)D + (2ae - bd)^2 \ll P^8. \]
Moreover,
\[ X = Dy - 2ae + bd \ll P^4(|y| + 1) \ll P^5 \]
and
\[ Y = 2ax + by + d \ll P^2(|x| + |y| + 1) \ll P^3 \]
if $\max(|x|, |y|) \leq P$. Since $DNP^5 \ll P^{17}$, it follows from Lemma 2.10, applied with $\varepsilon/17$ in place of $\varepsilon$, that
\[ W \ll (DNP^5)^{\varepsilon/17} \ll P^{\varepsilon}. \]
This completes the proof.


Theorem 2.6 (Hooley–Wooley) Let $D(n)$ denote the number of integers not exceeding $n$ that have at least two essentially distinct representations as the sum of two nonnegative integral cubes. Then
\[ D(n) \ll n^{5/9 + \varepsilon}. \]
Proof. If $N$ has at least two essentially distinct representations as the sum of two nonnegative cubes, then there exist integers $x_1, x_2, x_3, x_4$ such that
\[ x_1^3 + x_2^3 = x_3^3 + x_4^3 = N \]
and
\[ 0 \leq x_3 < x_1 \leq x_2 < x_4 \leq N^{1/3}. \]
For any number $P \geq 2$, let $S(P)$ denote the number of solutions of the equation
\[ x_1^3 + x_2^3 = x_3^3 + x_4^3 \tag{2.21} \]
that satisfy
\[ 0 \leq x_3 < x_1 \leq x_2 < x_4 \leq P. \tag{2.22} \]
Then
\[ D(n) \leq S(n^{1/3}). \tag{2.23} \]
If the integers $x_1, x_2, x_3, x_4$ satisfy (2.21) and (2.22), then $x_1 + x_2 \neq x_3 + x_4$ by Exercise 7, and so
\[ x_1 + x_2 = x_3 + x_4 + h, \]
where
\[ 1 \leq |h| \leq 2P. \]
Let $T(P, h)$ denote the number of solutions of the simultaneous equations
\[ x_1^3 + x_2^3 = x_3^3 + x_4^3 \]
and
\[ x_1 + x_2 = x_3 + x_4 + h \]
with
\[ 0 \leq x_i \leq P \quad\text{for } i = 1, \ldots, 4. \]
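Choose the integer $\ell$ so that
\[ 2^{\ell} \leq 2P < 2^{\ell+1}. \]
Then
\[
\begin{aligned}
S(P) &\leq \sum_{1 \leq |h| \leq 2P} T(P, h) \\
&\leq \sum_{0 \leq i \leq \ell}\;\sum_{2^i \leq |h| < 2^{i+1}} T(P, h) \\
&\ll (\ell + 1)\max_{0 \leq i \leq \ell}\Bigl\{\sum_{2^i \leq |h| < 2^{i+1}} T(P, h)\Bigr\} \\
&\ll \log P\max_{1 \leq H \leq 2P}\Bigl\{\sum_{H \leq |h| < 2H} T(P, h)\Bigr\}.
\end{aligned}
\]
Since $x_3$ is the smallest of the four integers $x_1, x_2, x_3, x_4$, we have
\[ 2x_4 + h > x_3 + x_4 + h = x_1 + x_2 > 0. \]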
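For fixed $h$, we can use $x_1, \ldots, x_4$ to define four positive integers $u_1, u_2, u_3$, and $y$ as follows:
\[
\begin{aligned}
u_1 &= x_1 + x_2, \\
u_2 &= x_1 - x_3, \\
u_3 &= x_2 - x_3, \\
y &= 2x_4 + h,
\end{aligned}
\]
where
\[ 1 \leq u_i \leq 2P \quad\text{for } i = 1, 2, 3 \]
and
\[ 1 \leq y \leq 4P. \]
Moreover,
\[ u_1 + u_2 + u_3 = 2(x_1 + x_2 - x_3) = 2(x_4 + h) = y + h \]
and
\[
\begin{aligned}
h(3y^2 + h^2) &= h\bigl(3(2x_4 + h)^2 + h^2\bigr) \\
&= h(12x_4^2 + 12x_4h + 4h^2) \\
&= 4(3x_4^2h + 3x_4h^2 + h^3) \\
&= 4\bigl((x_4 + h)^3 - x_4^3\bigr) \\
&= 4\bigl((x_1 + x_2 - x_3)^3 - x_1^3 - x_2^3 + x_3^3\bigr) \\
&= 12(x_1^2x_2 + x_1x_2^2 - x_1^2x_3 + x_1x_3^2 - x_2^2x_3 + x_2x_3^2 - 2x_1x_2x_3) \\
&= 12(x_1 + x_2)(x_1 - x_3)(x_2 - x_3) \\
&= 12u_1u_2u_3.
\end{aligned}
\]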
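The change of variables can be traced on the smallest example, $1729 = 9^3 + 10^3 = 1^3 + 12^3$, for which $(x_1, x_2, x_3, x_4) = (9, 10, 1, 12)$. The Python sketch below is an illustration only; the helper names `hooley_variables` and `reduce_by_gcds` are hypothetical, and the second helper assumes, as in the argument that follows in the proof, that the indices are already arranged so that $(u_3, h)$ is the largest of the three greatest common divisors.

```python
from math import gcd


def hooley_variables(x1, x2, x3, x4):
    """Return h, (u1, u2, u3), y for a solution of x1^3 + x2^3 = x3^3 + x4^3."""
    h = x1 + x2 - x3 - x4
    u = (x1 + x2, x1 - x3, x2 - x3)
    y = 2 * x4 + h
    return h, u, y


def reduce_by_gcds(u, h):
    """Return (d1, d2, d3), g, (v1, v2, v3) with u_i = d_i v_i and h = g d1 d2 d3 (h > 0 assumed)."""
    u1, u2, u3 = u
    d3 = gcd(u3, h)
    d2 = gcd(u2, h // d3)
    d1 = gcd(u1, h // (d2 * d3))
    g = h // (d1 * d2 * d3)
    return (d1, d2, d3), g, (u1 // d1, u2 // d2, u3 // d3)


if __name__ == "__main__":
    h, u, y = hooley_variables(9, 10, 1, 12)
    print(h, u, y)
    print(sum(u) == y + h, 12 * u[0] * u[1] * u[2] == h * (3 * y * y + h * h))
    d, g, v = reduce_by_gcds(u, h)
    f = 12 // g
    lhs = f * v[0] * v[1] * v[2]
    rhs = 3 * (d[0] * v[0] + d[1] * v[1] + d[2] * v[2] - h) ** 2 + h * h
    print(d, g, v, lhs == rhs)
```

Here $h = 6$, $(u_1, u_2, u_3) = (19, 8, 9)$, and $y = 30$, so that $h(3y^2 + h^2) = 16416 = 12 \cdot 19 \cdot 8 \cdot 9$.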
Conversely, the numbers $u_1, u_2, u_3$, and $y$ determine $x_1, \ldots, x_4$ uniquely. It follows that
\[ T(P, h) \leq U(P, h), \]
where $U(P, h)$ denotes the number of solutions of the equations
\[ u_1 + u_2 + u_3 = y + h \tag{2.24} \]
and
\[ 12u_1u_2u_3 = h(3y^2 + h^2) \tag{2.25} \]
in positive integers $u_i \leq 2P$ and $y \leq 4P$. If $u_i = h$ for some $i$, say, $u_3 = h$, then $u_1 + u_2 = y$ and
\[ 12u_1u_2 = 3y^2 + h^2 = 3u_1^2 + 6u_1u_2 + 3u_2^2 + h^2. \]
This implies that
\[ 3(u_1 - u_2)^2 + h^2 = 0, \]
which is impossible since $h \neq 0$. Therefore, $u_i \neq h$ for all $i = 1, 2, 3$. Let $u_1, u_2, u_3, y$ be a solution of equations (2.24) and (2.25) counted in $U(P, h)$. We can assume that
\[ (u_3, h) = \max\{(u_i, h) : i = 1, 2, 3\}, \]
where $(a, b)$ denotes the greatest common divisor of $a$ and $b$. We define
\[ d_3 = (u_3, h), \]
\[ d_2 = \left(u_2, \frac{h}{d_3}\right), \]
\[ d_1 = \left(u_1, \frac{h}{d_2d_3}\right). \]
Then
\[ d_3 = \max\{d_1, d_2, d_3\} \]
and $d_1d_2d_3$ divides $h$. Let
\[ g = \frac{h}{d_1d_2d_3} \]
and
\[ v_i = \frac{u_i}{d_i} \quad\text{for } i = 1, 2, 3. \]
Then
\[ (v_i, g) = 1 \quad\text{and}\quad 1 \leq v_i \leq \frac{2P}{d_i} \quad\text{for } i = 1, 2, 3. \tag{2.26} \]
It follows from (2.25) that
\[ 12v_1v_2v_3 = g(3y^2 + h^2), \]
and so $g$ divides 12, that is,
\[ fg = 12 \]
for some integer $f$. Therefore, $|h| = |gd_1d_2d_3| \leq 12d_3^3$, and so
\[ d_3 \gg |h|^{1/3}. \tag{2.27} \]
Since $u_3 \neq h$, it follows that
\[ v_3 \neq gd_1d_2. \tag{2.28} \]
We can rewrite equation (2.25) in terms of the new variables $v_i, d_i, f, g$. Since
\[ h = gd_1d_2d_3 \]
and
\[ y = d_1v_1 + d_2v_2 + d_3v_3 - h, \]
we have
\[ 12u_1u_2u_3 = fgd_1d_2d_3v_1v_2v_3 = fhv_1v_2v_3 = h(3y^2 + h^2), \]
and so
\[ fv_1v_2v_3 = 3(d_1v_1 + d_2v_2 + d_3v_3 - h)^2 + h^2. \tag{2.29} \]
If we fix the integers $d_1, d_2, d_3, f, g, v_3$, then equation (2.29) becomes a quadratic equation in $v_1, v_2$:
\[
\begin{aligned}
&3d_1^2v_1^2 + (6d_1d_2 - fv_3)v_1v_2 + 3d_2^2v_2^2 + 6d_1(d_3v_3 - h)v_1 \\
&\quad + 6d_2(d_3v_3 - h)v_2 + 3(d_3v_3 - h)^2 + h^2 = 0.
\end{aligned}
\tag{2.30}
\]
The discriminant of this quadratic is
\[
\begin{aligned}
D &= (6d_1d_2 - fv_3)^2 - 36d_1^2d_2^2 \\
&= f^2v_3^2 - 12d_1d_2fv_3 \\
&= f^2v_3^2 - d_1d_2f^2gv_3 \\
&= f^2v_3(v_3 - d_1d_2g) \\
&\neq 0
\end{aligned}
\]
by (2.28). Similarly, the integer $N$ defined by (2.20) is nonzero, because
\[
\begin{aligned}
N &= \bigl(4 \cdot 3d_1^2(3(d_3v_3 - h)^2 + h^2) - (6d_1(d_3v_3 - h))^2\bigr)D \\
&\quad + \bigl(2 \cdot 3d_1^2 \cdot 6d_2(d_3v_3 - h) - (6d_1d_2 - fv_3) \cdot 6d_1(d_3v_3 - h)\bigr)^2 \\
&= 12d_1^2h^2D + \bigl(6d_1fv_3(d_3v_3 - h)\bigr)^2 \\
&= 12d_1^2h^2f^2v_3(v_3 - d_1d_2g) + 36d_1^2f^2v_3^2d_3^2(v_3 - d_1d_2g)^2 \\
&= 12d_1^2d_3^2f^2v_3(v_3 - d_1d_2g)\bigl((d_1d_2g)^2 - 3d_1d_2gv_3 + 3v_3^2\bigr) \\
&= 3d_1^2d_3^2f^2v_3(v_3 - d_1d_2g)\bigl((d_1d_2g)^2 + 3(d_1d_2g - 2v_3)^2\bigr) \\
&\neq 0.
\end{aligned}
\]
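Let $W(P, d_1, d_2, d_3, f, g, v_3)$ denote the number of solutions of equation (2.30) in integers $v_1, v_2$ satisfying (2.26). Since the coefficients of this quadratic equation are all $\ll P^2$, it follows from Lemma 2.12 that
\[ W(P, d_1, d_2, d_3, f, g, v_3) \ll P^{\varepsilon}. \]
Therefore,
\[
\begin{aligned}
S(P) &\ll \log P\max_{1 \leq H \leq 2P}\sum_{H \leq |h| < 2H} T(P, h) \\
&\leq \log P\max_{1 \leq H \leq 2P}\sum_{H \leq |h| < 2H} U(P, h) \\
&\ll \log P\max_{1 \leq H \leq 2P}\sum_{H \leq |h| < 2H}\;\sum_{fg = 12}\;\sum_{\substack{gd_1d_2d_3 = h \\ d_3 \geq \max(d_1, d_2)}}\;\sum_{\substack{1 \leq v_3 \leq 2P/d_3 \\ v_3 \neq gd_1d_2}} W(P, d_1, d_2, d_3, f, g, v_3) \\
&\ll \log P\max_{1 \leq H \leq 2P}\sum_{H \leq |h| < 2H}\;\sum_{fg = 12}\;\sum_{\substack{gd_1d_2d_3 = h \\ d_3 \geq \max(d_1, d_2)}}\;\sum_{\substack{1 \leq v_3 \leq 2P/d_3 \\ v_3 \neq gd_1d_2}} P^{\varepsilon} \\
&\ll P^{2\varepsilon}\max_{1 \leq H \leq 2P}\sum_{H \leq |h| < 2H}\;\sum_{fg = 12}\;\sum_{\substack{gd_1d_2d_3 = h \\ d_3 \geq \max(d_1, d_2)}} \frac{P}{d_3} \\
&\ll P^{1 + 2\varepsilon}\max_{1 \leq H \leq 2P}\sum_{H \leq |h| < 2H}\;\sum_{\substack{gd_1d_2d_3 = h \\ d_3 \geq \max(d_1, d_2)}} \frac{1}{d_3}.
\end{aligned}
\]
Since the number of factorizations of $h$ in the form $h = gd_1d_2d_3$ is $\ll |h|^{\varepsilon}$, and since
\[ d_3 \gg |h|^{1/3} \]
by (2.27), we have
\[ \sum_{H \leq |h| < 2H}\;\sum_{\substack{gd_1d_2d_3 = h \\ d_3 \geq \max(d_1, d_2)}} \frac{1}{d_3} \ll \sum_{H \leq |h| < 2H} |h|^{-1/3 + \varepsilon} \ll H^{2/3 + \varepsilon}, \]
and so
\[ S(P) \ll P^{1 + 2\varepsilon}\max_{1 \leq H \leq 2P} H^{2/3 + \varepsilon} \ll P^{5/3 + 3\varepsilon}. \]
Therefore, by (2.23), we have
\[ D(n) \leq S(n^{1/3}) \ll n^{5/9 + \varepsilon}. \]
This completes the proof.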


Theorem 2.7 (Erdős) Almost all integers that can be represented as the sum of two positive cubes have essentially only one such representation.

Proof. This follows immediately from the remark that there are more than $cn^{2/3}$ integers up to $n$ that can be represented in at least one way as the sum of two nonnegative cubes, but there are no more than $c'n^{5/9 + \varepsilon} = o(n^{2/3})$ integers up to $n$ that have two or more essentially distinct representations as the sum of two cubes.
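The contrast between Theorems 2.5 and 2.6 is already visible in small ranges. The following Python sketch is illustrative only (the function name `count_sums_of_two_cubes` is hypothetical): it counts the integers up to $n$ that are sums of two positive cubes, and among them those with at least two essentially distinct representations.

```python
def count_sums_of_two_cubes(n):
    """Return (total, multiple): integers m <= n with m = x^3 + y^3, x, y >= 1,
    and those among them with at least two essentially distinct representations."""
    counts = {}
    x = 1
    while 2 * x**3 <= n:
        y = x
        while x**3 + y**3 <= n:
            m = x**3 + y**3
            counts[m] = counts.get(m, 0) + 1   # one count per unordered pair {x, y}
            y += 1
        x += 1
    total = len(counts)
    multiple = sum(1 for c in counts.values() if c >= 2)
    return total, multiple


if __name__ == "__main__":
    for n in (10**3, 10**4, 10**5, 10**6):
        total, multiple = count_sums_of_two_cubes(n)
        print(n, total, multiple, total / n ** (2 / 3))
```

The last column is the ratio of the count to $n^{2/3}$, the order of magnitude given by the Erdős–Mahler theorem, while the number of integers with two essentially distinct representations remains very small in comparison.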


2.5 Notes 


Wieferich’s proof [144] that g(3) = 9 appeared in Mathematische Annalen in 1909. 
In the immediately following paper in the same issue of that journal, Landau [75] 
proved that $G(3) \leq 8$. Dickson [24] showed that 23 and 239 are the only positive
integers not representable as the sum of eight nonnegative cubes. An error
in Wieferich’s paper was corrected by Kempner [70]. Scholz [108] gives a nice 
version of the Wieferich-Kempner proof. 

Linnik’s proof [81] of the theorem that $G(3) \leq 7$ is difficult. Watson [139]
subsequently discovered a different and much more elementary proof of this result,
and it is Watson’s proof that is given in this chapter. Dress [25] has a simple proof
that $G(3) \leq 11$.

Vaughan [126] obtained an asymptotic formula for $r_{3,8}(n)$, the number of representations
of an integer as the sum of eight cubes. It is an open problem to obtain
an asymptotic formula for the number of representations of an integer as the sum 
of seven or fewer cubes. 

It is possible that every sufficiently large integer is the sum of four nonnegative cubes. Let $E_{4,3}(x)$ denote the number of positive integers up to $x$ that cannot be written as the sum of four positive cubes. Davenport [17] proved that $E_{4,3}(x) \ll x^{29/30 + \varepsilon}$, and so almost all positive integers can be represented as the sum of four positive cubes. Brüdern [6] proved that
\[ E_{4,3}(x) \ll x^{37/42 + \varepsilon}. \]


There are interesting identities that express a linear polynomial as the sum of 
the cubes of four polynomials with integer coefficients. Such identities enable us 
to represent the integers in particular congruence classes as sums of four inte- 
gral cubes. See Mordell [85, 86], Demjanenko [20], and Revoy [101] for such 
polynomial identities. 

Theorem 2.5 was first proved by Erdős and Mahler [31, 35]. The beautiful
elementary proof given in this chapter is due to Erdős [31]. Similarly, Theorem 2.6
was originally proved by Hooley [57, 58]. The elementary proof presented here is 
due to Wooley [149]. For an elementary discussion of elliptic curves and sums of 
two cubes, see Silverman [115] and Silverman and Tate [116, pages 147-151]. 

Waring stated in 1770 that $g(2) = 4$, $g(3) = 9$, and $g(4) = 19$. The theorem that
every nonnegative integer is the sum of 19 fourth powers was finally proved in 
1992 in joint work of Balasubramanian [2] and Deshouillers and Dress [21]. 
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2.6 


2. Waring’s problem for cubes 


Exercises 


. Prove that 


33+ 44+5=6 
is the only solution in integers of the equation 


(x — 3 +(x —2)P' +(x — 1 =x’. 


. Let s(N) be the smallest number such that N can be written as the sum of 


s(N) positive cubes. Compute s(N) for N = 1,..., 100. 


. Prove that s(239) = 9, that is, 239 cannot be written as a sum of eight 


nonnegative cubes. 


. Show that none of the following numbers 


IS 22 50 114 167 
175 186 212 231 238 
303 364 420 428 454 


can be written as a sum of seven nonnegative cubes. 


. Show that none of the following numbers 


79, 159, 239, 319, 399, 479, 559 


can be written as a sum of 18 fourth powers. 


. Let v(3) denote the smallest number such that every integer can be written 


as the sum or difference of v(3) nonnegative integral cubes. 


(a) Prove that 
4 < u(3) < g(3). 


(b) Prove that 
v(3) < 5. 


Hint: Use the polynomial identity 
6x =(x +1)? +(x — 1)° — 2x? 
and the fact that x = (N — N°)/6 is an integer for every integer N. 


It is an unsolved problem to determine whether v(3) = 4 or 5. This is called 
the easier Waring’s problem for cubes. 


. Let x, y, u, v be positive integers. Prove that if x+y =v+vandx>+y? = 


u> + v>, then {x, y} = {u, v}. 


8. 


10. 


11. 
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(Von Sterneck [136]) Using a computer, calculate s(n) for n up to 40,000. 
Verify the results of Lemma 2.4. 


. (Mahler [82]) Prove that 1 has infinitely many different representations as 


the sum of three cubes. Hint: Establish the polynomial identity 
(9x*)* + (3x — 9x“)? +(1 —9x°)? = 1. (2.31) 
Prove that 
(9m*) + 3mn? — 9m*y + (n* —9m?ny =n". 


Let r3,3(N) denote the number of representations of N as the sum of three 
nonnegative cubes. Prove that if N = n!* for some positive integer n, then 


733(N) > 93 Nt/?2, 


Note: This is Mahler’s counterexample to Hypothesis K of Hardy and Lit- 
tlewood [49]. 


10. (Elkies and Kaplansky [27]) Verify the following polynomial identities:
\begin{align*}
8(x^2 + y^2 - z^3) &= (2x + 2y)^2 + (2x - 2y)^2 - (2z)^3, \\
2x + 1 &= (x^3 - 3x^2 + x)^2 + (x^2 - x - 1)^2 - (x^2 - 2x)^3, \\
2(2x + 1) &= (2x^3 - 2x^2 - x)^2 + (2x^3 - 4x^2 - x + 1)^2 - (2x^2 - 2x - 1)^3, \\
4(2x + 1) &= (x^3 + x + 2)^2 + (x^2 - 2x - 1)^2 - (x^2 + 1)^3.
\end{align*}
Show that every nonzero integer $N$, positive or negative, can be written uniquely in
the form
\[ N = 8^q 2^r (2m + 1), \]
where $q \ge 0$, $r \in \{0, 1, 2\}$, and $m \in \mathbf{Z}$. Prove that every integer $N$ can be
written in the form
\[ N = a^2 + b^2 - c^3, \]
where $a, b, c$ are integers.
11. Let $a$ be a positive rational number. Consider the equations
\begin{align*}
a &= x^3 + y^3 + z^3 \\
a &= (x + y + z)^3 - 3(y + z)(z + x)(x + y) \\
8a &= (u + v + w)^3 - 24uvw.
\end{align*}
Prove that if any one of these equations has a solution in positive rational
numbers, then each of the three equations does.


12. Let $a$ be a rational number. Let $r$ be any rational number such that $r \ne 0$ and
\[ t = \frac{a}{72r^3} \ne -1. \]
For any rational number $w$, let
\[ u = \left( \frac{24t^2}{(t+1)^3} - 1 \right) w, \qquad v = \left( \frac{24t}{(t+1)^3} \right) w. \]
Prove that
\[ (u + v + w)^3 - 24uvw = 8a \left( \frac{w}{r(t+1)} \right)^3. \]
Let $w = r(t + 1)$. Prove that there exist rational numbers $x, y, z$ such that
\begin{align*}
u &= y + z \\
v &= z + x \\
w &= x + y
\end{align*}
and
\[ a = x^3 + y^3 + z^3. \]
This proves that every rational number can be written as the sum of three
rational cubes.


13. Let $a$ be a positive rational number. Show that it is possible to choose $r$ in
Exercise 12 so that
\[ a = x^3 + y^3 + z^3, \]
where $x, y, z$ are positive rational numbers. This proves that every positive
rational number can be written as the sum of three positive rational cubes.


3 
The Hilbert—Waring theorem 


Nous ne devons pas douter que ces considérations, qui permettent ainsi
d’obtenir des relations arithmétiques en les faisant sortir d’identités
où figurent des intégrales définies, ne puissent un jour, quand on en
aura bien compris le sens, être appliquées à des problèmes bien plus
étendus que celui de Waring.¹


H. Poincaré [96] 


3.1 Polynomial identities and a conjecture of Hurwitz 


Waring’s problem for exponent k is to prove that the set of kth powers of
nonnegative integers is a basis of finite order, that is, to prove that every
nonnegative integer can be written as the sum of a bounded number of kth powers.
We denote by g(k) the smallest number s such that every nonnegative integer is
the sum of exactly s kth powers of nonnegative integers. Waring’s problem is to
show that g(k) is finite; Hilbert proved this in 1909. The goal of this chapter
is to prove the Hilbert-Waring theorem: the kth powers are a basis of finite
order for every positive integer k.
We have already proved Waring’s problem for exponent two (the squares) and
exponent three (the cubes). Other cases of Waring’s problem can be deduced from
these results by means of polynomial identities. Here are three examples. We use
the notation

\[ (x_1 \pm x_2 \pm \cdots \pm x_h)^k = \sum_{\varepsilon_2, \ldots, \varepsilon_h = \pm 1} (x_1 + \varepsilon_2 x_2 + \cdots + \varepsilon_h x_h)^k. \]


¹We should not doubt that [Hilbert’s] method, which makes it possible to obtain
arithmetic relations from identities involving definite integrals, might one day,
when it is better understood, be applied to problems far more general than Waring’s.

Theorem 3.1 (Liouville)
\[ (x_1^2 + x_2^2 + x_3^2 + x_4^2)^2 = \frac{1}{6} \sum_{1 \le i < j \le 4} (x_i + x_j)^4 + \frac{1}{6} \sum_{1 \le i < j \le 4} (x_i - x_j)^4 \]
is a polynomial identity, and every nonnegative integer is the sum of 53 fourth
powers, that is,
\[ g(4) \le 53. \]

Proof. We begin by observing that
\[ (x_1 \pm x_2)^4 = (x_1 + x_2)^4 + (x_1 - x_2)^4 = 2x_1^4 + 12x_1^2x_2^2 + 2x_2^4, \]
and so
\begin{align*}
\sum_{1 \le i < j \le 4} (x_i \pm x_j)^4
&= \sum_{1 \le i < j \le 4} (x_i + x_j)^4 + \sum_{1 \le i < j \le 4} (x_i - x_j)^4 \\
&= \sum_{1 \le i < j \le 4} \left( 2x_i^4 + 12x_i^2x_j^2 + 2x_j^4 \right) \\
&= 6 \sum_{i=1}^{4} x_i^4 + 12 \sum_{1 \le i < j \le 4} x_i^2x_j^2 \\
&= 6 \left( x_1^2 + x_2^2 + x_3^2 + x_4^2 \right)^2.
\end{align*}
This proves Liouville’s identity.

Let a be a nonnegative integer. By Lagrange’s theorem, $a = x_1^2 + x_2^2 + x_3^2 + x_4^2$
is the sum of four squares, and so
\[ 6a^2 = 6 \left( x_1^2 + x_2^2 + x_3^2 + x_4^2 \right)^2 = \sum_{1 \le i < j \le 4} (x_i + x_j)^4 + \sum_{1 \le i < j \le 4} (x_i - x_j)^4 \]
is the sum of 12 fourth powers. Every nonnegative integer n can be written in the
form $n = 6q + r$, where $q \ge 0$ and $0 \le r \le 5$. By Lagrange’s theorem again, we
have $q = a_1^2 + \cdots + a_4^2$, and so $6q = 6a_1^2 + \cdots + 6a_4^2$ is the sum of 48 fourth powers.
Since r is the sum of 5 fourth powers, each of them either $0^4$ or $1^4$, it follows that
n is the sum of 53 fourth powers. This completes the proof.
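The proof of Theorem 3.1 is constructive, and the construction can be sketched in code. The following is a minimal illustration, not part of the original text: the function names are hypothetical, and the four-square decomposition is found by brute-force search rather than by any method from the book.

```python
from itertools import combinations
from math import isqrt

def four_squares(n):
    """Return (a, b, c, d) with a^2 + b^2 + c^2 + d^2 == n, by brute force."""
    for a in range(isqrt(n) + 1):
        for b in range(a, isqrt(n - a * a) + 1):
            rest = n - a * a - b * b
            for c in range(b, isqrt(rest) + 1):
                d = isqrt(rest - c * c)
                if d * d == rest - c * c:
                    return (a, b, c, d)
    raise ValueError("no representation found (excluded by Lagrange's theorem)")

def twelve_fourth_powers(a):
    """Bases of 12 fourth powers summing to 6*a^2, via Liouville's identity."""
    x = four_squares(a)
    bases = []
    for i, j in combinations(range(4), 2):
        bases.append(x[i] + x[j])
        bases.append(abs(x[i] - x[j]))
    return bases

def fifty_three_fourth_powers(n):
    """Bases of 53 fourth powers (zeros allowed) summing to n."""
    q, r = divmod(n, 6)                  # n = 6q + r with 0 <= r <= 5
    bases = []
    for a in four_squares(q):            # q = a1^2 + a2^2 + a3^2 + a4^2
        bases.extend(twelve_fourth_powers(a))
    bases.extend([1] * r + [0] * (5 - r))
    return bases

bases = fifty_three_fourth_powers(2024)
print(len(bases), sum(b ** 4 for b in bases))
```

The search in `four_squares` is only meant for small inputs; any other method of producing a four-square representation could be substituted.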

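The proofs of the following two results are similar.

Theorem 3.2 (Fleck)
\[ (x_1^2 + x_2^2 + x_3^2 + x_4^2)^3 = \frac{1}{60} \sum_{1 \le i < j < k \le 4} (x_i \pm x_j \pm x_k)^6 + \frac{1}{30} \sum_{1 \le i < j \le 4} (x_i \pm x_j)^6 + \frac{3}{5} \sum_{1 \le i \le 4} x_i^6 \]
is a polynomial identity, and every nonnegative integer is the sum of a bounded
number of sixth powers.


Theorem 3.3 (Hurwitz)
\begin{align*}
(x_1^2 + x_2^2 + x_3^2 + x_4^2)^4
= {} & \frac{1}{840} (x_1 \pm x_2 \pm x_3 \pm x_4)^8 + \frac{1}{5040} \sum_{i;\, j < k} (2x_i \pm x_j \pm x_k)^8 \\
& + \frac{1}{84} \sum_{1 \le i < j \le 4} (x_i \pm x_j)^8 + \frac{1}{840} \sum_{1 \le i \le 4} (2x_i)^8
\end{align*}
is a polynomial identity, and every nonnegative integer is the sum of a bounded
number of eighth powers. Here the second sum runs over the 12 choices of an index
i and a pair j < k of indices different from i.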
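The identities of Fleck and Hurwitz can be spot-checked with exact rational arithmetic before attempting a proof. The sketch below is an illustration only: the helper names are hypothetical, the coefficients are those of the standard forms of the two identities, and the second Hurwitz sum is taken over every doubled index i together with every pair j < k of the remaining indices.

```python
from fractions import Fraction
from itertools import combinations, product

def pm(first, rest, power):
    """Sum of (first + e_1*rest[0] + ... + e_m*rest[m-1])**power over all signs e_i = +-1."""
    return sum((first + sum(e * y for e, y in zip(signs, rest))) ** power
               for signs in product((1, -1), repeat=len(rest)))

def fleck_rhs(x):
    t3 = sum(pm(x[i], (x[j], x[k]), 6) for i, j, k in combinations(range(4), 3))
    t2 = sum(pm(x[i], (x[j],), 6) for i, j in combinations(range(4), 2))
    t1 = sum(v ** 6 for v in x)
    return Fraction(t3, 60) + Fraction(t2, 30) + Fraction(3 * t1, 5)

def hurwitz_rhs(x):
    t4 = pm(x[0], x[1:], 8)
    t3 = sum(pm(2 * x[i], (x[j], x[k]), 8)
             for i in range(4)
             for j, k in combinations([m for m in range(4) if m != i], 2))
    t2 = sum(pm(x[i], (x[j],), 8) for i, j in combinations(range(4), 2))
    t1 = sum((2 * v) ** 8 for v in x)
    return Fraction(t4, 840) + Fraction(t3, 5040) + Fraction(t2, 84) + Fraction(t1, 840)

x = (3, -1, 4, 2)
s = sum(v * v for v in x)
print(fleck_rhs(x) == s ** 3, hurwitz_rhs(x) == s ** 4)
```

Agreement at finitely many integer points is evidence, not a proof; a proof compares the coefficients of each monomial on both sides.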


Suppose that

\[ (x_1^2 + \cdots + x_4^2)^k = \sum_{i=1}^{M} a_i \left( b_{i,1}x_1 + b_{i,2}x_2 + b_{i,3}x_3 + b_{i,4}x_4 \right)^{2k} \tag{3.1} \]

for some positive integer M, integers $b_{i,j}$, and positive rational numbers $a_i$. Hurwitz
observed that this polynomial identity and Lagrange’s theorem immediately imply
that if Waring’s problem is true for exponent k, then it is also true for exponent 2k.
Hilbert subsequently proved the existence of polynomial identities of the form (3.1) 
for all positive integers k, and he applied it to show that the set of nonnegative 
integral kth powers is a basis of finite order for every exponent k. This was the first 
proof of Waring’s problem. In the next section, we obtain Hilbert’s polynomial 
identities. 


3.2 Hermite polynomials and Hilbert’s identity 


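For $n \ge 0$, we define the Hermite polynomial $H_n(x)$ by

\[ H_n(x) = \left( -\frac{1}{2} \right)^n e^{x^2} \frac{d^n}{dx^n} \left( e^{-x^2} \right). \]

The first five Hermite polynomials are

\begin{align*}
H_0(x) &= 1 \\
H_1(x) &= x \\
H_2(x) &= x^2 - \frac{1}{2} \\
H_3(x) &= x^3 - \frac{3}{2} x \\
H_4(x) &= x^4 - 3x^2 + \frac{3}{4}.
\end{align*}

Since

\begin{align*}
H_n'(x) &= \left( -\frac{1}{2} \right)^n \frac{d}{dx} \left( e^{x^2} \frac{d^n}{dx^n} \left( e^{-x^2} \right) \right) \\
&= \left( -\frac{1}{2} \right)^n (2x) e^{x^2} \frac{d^n}{dx^n} \left( e^{-x^2} \right) - 2 \left( -\frac{1}{2} \right)^{n+1} e^{x^2} \frac{d^{n+1}}{dx^{n+1}} \left( e^{-x^2} \right) \\
&= 2x H_n(x) - 2H_{n+1}(x),
\end{align*}

the Hermite polynomials satisfy the recurrence relation
\[ H_{n+1}(x) = x H_n(x) - \frac{1}{2} H_n'(x). \tag{3.2} \]
It follows that $H_n(x)$ is a monic polynomial of degree n with rational coefficients
and that $H_n(x)$ is an even polynomial for n even and an odd polynomial for n odd.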
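The recurrence (3.2) gives a direct way to generate the polynomials $H_n(x)$ with exact rational coefficients. The following sketch is illustrative only; the representation of a polynomial as a list of coefficients (constant term first) and the function names are choices made here, not in the text.

```python
from fractions import Fraction

def derivative(p):
    """Derivative of a polynomial given as a coefficient list, constant term first."""
    return [k * p[k] for k in range(1, len(p))] or [Fraction(0)]

def hermite(n):
    """Coefficient list of the monic Hermite polynomial H_n, built from (3.2)."""
    h = [Fraction(1)]                     # H_0(x) = 1
    for _ in range(n):
        shifted = [Fraction(0)] + h       # coefficients of x * H_m(x)
        dh = derivative(h)                # coefficients of H_m'(x)
        h = [c - (dh[i] / 2 if i < len(dh) else 0) for i, c in enumerate(shifted)]
    return h

for n in range(5):
    print(n, hermite(n))
```

For n = 0, ..., 4 the output reproduces the five polynomials listed above, for example $H_4(x) = x^4 - 3x^2 + 3/4$.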
Lemma 3.1 The Hermite polynomial $H_n(x)$ has n distinct real zeros.


Proof. This is by induction on n. The lemma is clearly true for n = 0 and n = 1,
since $H_1(x) = x$. Let $n \ge 1$, and assume that the lemma is true for n. Then $H_n(x)$
has n distinct real zeros, and these zeros must be simple. Therefore, there exist
real numbers
\[ \beta_n < \cdots < \beta_2 < \beta_1 \]
such that
\[ H_n(\beta_j) = 0 \]
and
\[ H_n'(\beta_j) \ne 0 \]
for $j = 1, \ldots, n$. Since $H_n(x)$ is a monic polynomial of degree n, it follows that
\[ \lim_{x \to \infty} H_n(x) = \infty, \]
and so
\[ H_n'(\beta_1) > 0. \]
Since the n − 1 distinct real zeros of the derivative $H_n'(x)$ are intertwined with the
n zeros of $H_n(x)$, it follows that
\[ (-1)^{j+1} H_n'(\beta_j) > 0 \]
for $j = 1, \ldots, n$. The recurrence relation (3.2) implies that
\[ H_{n+1}(\beta_j) = \beta_j H_n(\beta_j) - \frac{1}{2} H_n'(\beta_j) = -\frac{1}{2} H_n'(\beta_j), \]
and so
\[ (-1)^j H_{n+1}(\beta_j) = \frac{(-1)^{j+1}}{2} H_n'(\beta_j) > 0 \]
for $j = 1, \ldots, n$. Therefore, for $j = 2, \ldots, n$, $H_{n+1}(x)$ has a zero $\beta_j'$ in each open
interval $(\beta_j, \beta_{j-1})$. Since $\lim_{x \to \infty} H_{n+1}(x) = \infty$ and $H_{n+1}(\beta_1) < 0$, it follows that
$H_{n+1}(x)$ has a zero $\beta_1' > \beta_1$. If n is even, then $H_{n+1}(\beta_n) > 0$. Since n + 1 is odd,
$H_{n+1}(x)$ is a polynomial of odd degree, and so $\lim_{x \to -\infty} H_{n+1}(x) = -\infty$. It follows
that $H_{n+1}(x)$ has a zero $\beta_{n+1}' < \beta_n$. Similarly, if n is odd, $H_{n+1}(\beta_n) < 0$ and the
even polynomial $H_{n+1}(x)$ has a zero $\beta_{n+1}' < \beta_n$. Thus, $H_{n+1}(x)$ has n + 1 distinct
real zeros. This completes the proof.


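Lemma 3.2 Let $n \ge 1$ and f(x) be a polynomial of degree at most n − 1. Then
\[ \int_{-\infty}^{\infty} e^{-x^2} H_n(x) f(x)\, dx = 0. \]


Proof. This is by induction on n. If n = 1, then $H_1(x) = x$ and f(x) is constant,
say, $f(x) = a_0$, so
\[ \int_{-\infty}^{\infty} e^{-x^2} H_1(x) f(x)\, dx = a_0 \int_{-\infty}^{\infty} e^{-x^2} x\, dx = 0. \]
Now assume that the lemma is true for n, and let f(x) be a polynomial of degree
at most n. Then $f'(x)$ is a polynomial of degree at most n − 1. Integrating by parts,
we obtain
\begin{align*}
\int_{-\infty}^{\infty} e^{-x^2} H_{n+1}(x) f(x)\, dx
&= \left( -\frac{1}{2} \right)^{n+1} \int_{-\infty}^{\infty} \frac{d^{n+1}}{dx^{n+1}} \left( e^{-x^2} \right) f(x)\, dx \\
&= -\left( -\frac{1}{2} \right)^{n+1} \int_{-\infty}^{\infty} \frac{d^n}{dx^n} \left( e^{-x^2} \right) f'(x)\, dx \\
&= \frac{1}{2} \int_{-\infty}^{\infty} e^{-x^2} H_n(x) f'(x)\, dx \\
&= 0.
\end{align*}
This completes the proof.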


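Lemma 3.3 For $n \ge 0$,
\[ c_n = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-x^2} x^n\, dx =
\begin{cases} \dfrac{n!}{2^n (n/2)!} & \text{if n is even} \\ 0 & \text{if n is odd.} \end{cases} \tag{3.3} \]

Proof. This is by induction on n. For n = 0, we have
\[ \int_{-\infty}^{\infty} e^{-x^2}\, dx = \sqrt{\pi}, \]
and so $c_0 = 1$. For n = 1, the function $e^{-x^2}x$ is odd, and so
\[ \int_{-\infty}^{\infty} e^{-x^2} x\, dx = 0 \]
and $c_1 = 0$. Now let $n \ge 2$, and assume that the lemma holds for n − 2. Integrating
by parts, we obtain
\[ c_n = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-x^2} x^n\, dx = \frac{n-1}{2} \cdot \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-x^2} x^{n-2}\, dx = \frac{n-1}{2}\, c_{n-2}. \]
If n is odd, then $c_{n-2} = 0$ and so $c_n = 0$. If n is even, then
\[ c_n = \left( \frac{n-1}{2} \right) \frac{(n-2)!}{2^{n-2} ((n-2)/2)!} = \frac{n!}{2^n (n/2)!}. \]
This completes the proof.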
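Because the moments $c_n$ of (3.3) are rational, the orthogonality relation of Lemma 3.2 can be checked exactly for small n: the normalized integral of $e^{-x^2} p(x)$ is the sum of the coefficients of p weighted by the $c_k$. The sketch below assumes the coefficient-list representation of polynomials (constant term first); the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def moment(n):
    """c_n from (3.3): (1/sqrt(pi)) times the integral of exp(-x^2) x^n over the real line."""
    if n % 2:
        return Fraction(0)
    return Fraction(factorial(n), 2 ** n * factorial(n // 2))

def gauss_integral(p):
    """(1/sqrt(pi)) times the integral of exp(-x^2) p(x) dx, for a coefficient list p."""
    return sum(c * moment(k) for k, c in enumerate(p))

def hermite(n):
    """Monic Hermite polynomial H_n via H_{m+1} = x H_m - (1/2) H_m'."""
    h = [Fraction(1)]
    for _ in range(n):
        dh = [k * h[k] for k in range(1, len(h))]
        shifted = [Fraction(0)] + h
        h = [c - (dh[i] / 2 if i < len(dh) else 0) for i, c in enumerate(shifted)]
    return h

# Lemma 3.2 for the monomials f(x) = x^m with m < n; the general case follows by linearity.
for n in range(1, 7):
    for m in range(n):
        assert gauss_integral([Fraction(0)] * m + hermite(n)) == 0
print([moment(k) for k in range(7)])
```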
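Lemma 3.4 Let $n \ge 1$, let $\beta_1, \ldots, \beta_n$ be n distinct real numbers, and let $c_0, c_1,
\ldots, c_{n-1}$ be the numbers defined by (3.3). The system of linear equations
\[ \sum_{j=1}^{n} \beta_j^k x_j = c_k \qquad \text{for } k = 0, 1, \ldots, n-1 \tag{3.4} \]
has a unique solution $\rho_1, \ldots, \rho_n$. If r(x) is a polynomial of degree at most n − 1,
then
\[ \sum_{j=1}^{n} r(\beta_j) \rho_j = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-x^2} r(x)\, dx. \]

Proof. The existence and uniqueness of the solution $\rho_1, \ldots, \rho_n$ follows
immediately from the fact that the determinant of the system of linear equations
\begin{align*}
x_1 + x_2 + \cdots + x_n &= c_0 \\
\beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n &= c_1 \\
\beta_1^2 x_1 + \beta_2^2 x_2 + \cdots + \beta_n^2 x_n &= c_2 \\
&\;\;\vdots \\
\beta_1^{n-1} x_1 + \beta_2^{n-1} x_2 + \cdots + \beta_n^{n-1} x_n &= c_{n-1}
\end{align*}
is the Vandermonde determinant
\[ \begin{vmatrix}
1 & 1 & \cdots & 1 \\
\beta_1 & \beta_2 & \cdots & \beta_n \\
\beta_1^2 & \beta_2^2 & \cdots & \beta_n^2 \\
\vdots & \vdots & & \vdots \\
\beta_1^{n-1} & \beta_2^{n-1} & \cdots & \beta_n^{n-1}
\end{vmatrix} = \prod_{1 \le i < j \le n} (\beta_j - \beta_i) \ne 0. \]
Let $r(x) = \sum_{k=0}^{n-1} a_k x^k$. Then
\begin{align*}
\sum_{j=1}^{n} r(\beta_j) \rho_j
&= \sum_{j=1}^{n} \sum_{k=0}^{n-1} a_k \beta_j^k \rho_j \\
&= \sum_{k=0}^{n-1} a_k \sum_{j=1}^{n} \beta_j^k \rho_j \\
&= \sum_{k=0}^{n-1} a_k c_k \\
&= \sum_{k=0}^{n-1} a_k \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-x^2} x^k\, dx \\
&= \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-x^2} r(x)\, dx.
\end{align*}
This completes the proof.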


Lemma 3.5 Let $n \ge 1$, let $\beta_1, \ldots, \beta_n$ be the n distinct real roots of the Hermite
polynomial $H_n(x)$, and let $\rho_1, \ldots, \rho_n$ be the solution of the system of linear
equations (3.4). Let f(x) be a polynomial of degree at most 2n − 1. Then
\[ \sum_{j=1}^{n} f(\beta_j) \rho_j = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-x^2} f(x)\, dx. \]


Proof. By the division algorithm for polynomials, there exist polynomials q(x)
and r(x) of degree at most n − 1 such that
\[ f(x) = H_n(x) q(x) + r(x). \]
Since $H_n(\beta_j) = 0$ for $j = 1, \ldots, n$, we have
\[ f(\beta_j) = H_n(\beta_j) q(\beta_j) + r(\beta_j) = r(\beta_j), \]
and so, by Lemma 3.4 and Lemma 3.2,
\begin{align*}
\sum_{j=1}^{n} f(\beta_j) \rho_j
&= \sum_{j=1}^{n} r(\beta_j) \rho_j \\
&= \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-x^2} r(x)\, dx \\
&= \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-x^2} H_n(x) q(x)\, dx + \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-x^2} r(x)\, dx \\
&= \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-x^2} f(x)\, dx.
\end{align*}
This completes the proof.
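Lemma 3.6 Let $n \ge 1$, let $\beta_1, \ldots, \beta_n$ be the n distinct real roots of the Hermite
polynomial $H_n(x)$, and let $\rho_1, \ldots, \rho_n$ be the solution of the linear system (3.4).
Then
\[ \rho_i > 0 \qquad \text{for } i = 1, \ldots, n. \]

Proof. Since
\[ H_n(x) = \prod_{j=1}^{n} (x - \beta_j), \]
it follows that, for $i = 1, \ldots, n$,
\[ f_i(x) = \left( \frac{H_n(x)}{x - \beta_i} \right)^2 = \prod_{\substack{j=1 \\ j \ne i}}^{n} (x - \beta_j)^2 \]
is a monic polynomial of degree 2n − 2 such that $f_i(x) \ge 0$ for all x. Therefore,
\[ \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-x^2} f_i(x)\, dx > 0. \]
Since $f_i(\beta_i) > 0$ and $f_i(\beta_j) = 0$ for $j \ne i$, we have, by Lemma 3.5,
\[ f_i(\beta_i) \rho_i = \sum_{j=1}^{n} f_i(\beta_j) \rho_j = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-x^2} f_i(x)\, dx > 0, \]
and so $\rho_i > 0$. This completes the proof.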
Lemma 3.7 Let $n \ge 1$, and let $c_0, c_1, \ldots, c_{n-1}$ be the rational numbers defined
by (3.3). There exist pairwise distinct rational numbers $\beta_1^*, \ldots, \beta_n^*$ and positive
rational numbers $\rho_1^*, \ldots, \rho_n^*$ such that
\[ \sum_{j=1}^{n} (\beta_j^*)^k \rho_j^* = c_k \qquad \text{for } k = 0, 1, \ldots, n-1. \]

Proof. By Lemma 3.4, for any set of n pairwise distinct real numbers $\beta_1, \ldots, \beta_n$,
the system of n linear equations in n unknowns
\[ \sum_{j=1}^{n} \beta_j^k x_j = c_k \qquad \text{for } k = 0, 1, \ldots, n-1 \]
has a unique solution $(\rho_1, \ldots, \rho_n)$. Let $\mathcal{R}$ be the open subset of $\mathbf{R}^n$ consisting
of all points $(\beta_1, \ldots, \beta_n)$ such that $\beta_i \ne \beta_j$ for $i \ne j$, and let $\Phi : \mathcal{R} \to \mathbf{R}^n$ be
the function that sends $(\beta_1, \ldots, \beta_n)$ to $(\rho_1, \ldots, \rho_n)$. By Cramer’s rule for solving
linear equations, we can express each $\rho_j$ as a rational function of $\beta_1, \ldots, \beta_n$, and
so the function
\[ \Phi(\beta_1, \ldots, \beta_n) = (\rho_1, \ldots, \rho_n) \]
is continuous. Let $\mathbf{R}^n_+$ be the open subset of $\mathbf{R}^n$ consisting of all points $(x_1, \ldots, x_n)$
such that $x_i > 0$ for $i = 1, \ldots, n$. By Lemma 3.6, if $\beta_1, \ldots, \beta_n$ are the n zeros of
$H_n(x)$, then $(\beta_1, \ldots, \beta_n) \in \mathcal{R}$ and
\[ \Phi(\beta_1, \ldots, \beta_n) = (\rho_1, \ldots, \rho_n) \in \mathbf{R}^n_+. \]
Since $\mathbf{R}^n_+$ is an open subset of $\mathbf{R}^n$, it follows that $\Phi^{-1}(\mathbf{R}^n_+)$ is an open neighborhood
of $(\beta_1, \ldots, \beta_n)$ in $\mathcal{R}$. Since the points with rational coordinates are dense in $\mathcal{R}$, it
follows that this neighborhood contains a rational point $(\beta_1^*, \ldots, \beta_n^*)$. Let
\[ (\rho_1^*, \ldots, \rho_n^*) = \Phi(\beta_1^*, \ldots, \beta_n^*) \in \mathbf{R}^n_+. \]
Since each number $\rho_i^*$ can be expressed as a rational function with rational
coefficients of the rational numbers $\beta_1^*, \ldots, \beta_n^*$, it follows that each of the positive
numbers $\rho_i^*$ is rational. This completes the proof.
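A small instance makes Lemma 3.7 concrete. For n = 5 the zeros of $H_5(x) = x^5 - 5x^3 + \frac{15}{4}x$ are 0 and approximately ±0.9586 and ±2.0202, so the rational points −2, −1, 0, 1, 2 lie near them. The lemma only guarantees positive weights in a sufficiently small neighborhood of the zeros, so the sketch below (with hypothetical function names) solves the linear system (3.4) exactly and checks positivity directly for this particular choice.

```python
from fractions import Fraction
from math import factorial

def moment(n):
    """c_n from (3.3)."""
    if n % 2:
        return Fraction(0)
    return Fraction(factorial(n), 2 ** n * factorial(n // 2))

def solve_weights(betas):
    """Solve sum_j beta_j^k * rho_j = c_k for k = 0..n-1 by exact Gauss-Jordan elimination."""
    n = len(betas)
    # Augmented matrix: row k is [beta_1^k, ..., beta_n^k | c_k].
    rows = [[Fraction(b) ** k for b in betas] + [moment(k)] for k in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[j][n] for j in range(n)]

betas = [-2, -1, 0, 1, 2]
rhos = solve_weights(betas)
print(rhos, all(rho > 0 for rho in rhos))
```

Since the nodes are distinct, the Vandermonde determinant is nonzero and the pivot search always succeeds.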


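Lemma 3.8 Let $n \ge 1$, let $c_0, c_1, \ldots, c_{n-1}$ be the numbers defined by (3.3), let
$\beta_1, \ldots, \beta_n$ be n distinct real numbers, and let $\rho_1, \ldots, \rho_n$ be the solution of the
linear system (3.4). For every positive integer r and for $m = 1, 2, \ldots, n-1$,
\[ c_m \left( x_1^2 + \cdots + x_r^2 \right)^{m/2} = \sum_{j_1=1}^{n} \cdots \sum_{j_r=1}^{n} \rho_{j_1} \cdots \rho_{j_r} \left( \beta_{j_1} x_1 + \cdots + \beta_{j_r} x_r \right)^m \]
is a polynomial identity.


Proof. The proof is an exercise in algebraic manipulation and the multinomial
theorem. We have
\begin{align*}
\sum_{j_1=1}^{n} \cdots \sum_{j_r=1}^{n} & \rho_{j_1} \cdots \rho_{j_r} \left( \beta_{j_1} x_1 + \cdots + \beta_{j_r} x_r \right)^m \\
&= \sum_{j_1=1}^{n} \cdots \sum_{j_r=1}^{n} \rho_{j_1} \cdots \rho_{j_r} \sum_{\substack{\mu_1 + \cdots + \mu_r = m \\ \mu_i \ge 0}} \frac{m!}{\mu_1! \cdots \mu_r!} (\beta_{j_1} x_1)^{\mu_1} \cdots (\beta_{j_r} x_r)^{\mu_r} \\
&= m! \sum_{\substack{\mu_1 + \cdots + \mu_r = m \\ \mu_i \ge 0}} \sum_{j_1=1}^{n} \cdots \sum_{j_r=1}^{n} \prod_{i=1}^{r} \frac{x_i^{\mu_i}}{\mu_i!} \beta_{j_i}^{\mu_i} \rho_{j_i} \\
&= m! \sum_{\substack{\mu_1 + \cdots + \mu_r = m \\ \mu_i \ge 0}} \prod_{i=1}^{r} \frac{x_i^{\mu_i}}{\mu_i!} \sum_{j=1}^{n} \beta_j^{\mu_i} \rho_j \\
&= m! \sum_{\substack{\mu_1 + \cdots + \mu_r = m \\ \mu_i \ge 0}} \prod_{i=1}^{r} \frac{c_{\mu_i} x_i^{\mu_i}}{\mu_i!}.
\end{align*}
By Lemma 3.3, $c_m = 0$ if m is odd. If m is odd and $\mu_1 + \cdots + \mu_r = m$, then $\mu_i$
must be odd for some i, and so
\[ \sum_{j_1=1}^{n} \cdots \sum_{j_r=1}^{n} \rho_{j_1} \cdots \rho_{j_r} \left( \beta_{j_1} x_1 + \cdots + \beta_{j_r} x_r \right)^m = 0. \]
This proves the lemma for odd m. If m is even, then we need only consider
partitions of m into even parts $\mu_i = 2\nu_i$. Inserting the expressions for the numbers $c_{2\nu_i}$
from (3.3), we obtain
\begin{align*}
\sum_{j_1=1}^{n} \cdots \sum_{j_r=1}^{n} & \rho_{j_1} \cdots \rho_{j_r} \left( \beta_{j_1} x_1 + \cdots + \beta_{j_r} x_r \right)^m \\
&= m! \sum_{\substack{2\nu_1 + \cdots + 2\nu_r = m \\ \nu_i \ge 0}} \prod_{i=1}^{r} \frac{c_{2\nu_i} x_i^{2\nu_i}}{(2\nu_i)!} \\
&= m! \sum_{\substack{\nu_1 + \cdots + \nu_r = m/2 \\ \nu_i \ge 0}} \prod_{i=1}^{r} \frac{(2\nu_i)!}{2^{2\nu_i} \nu_i!} \cdot \frac{x_i^{2\nu_i}}{(2\nu_i)!} \\
&= \frac{m!}{2^m} \sum_{\substack{\nu_1 + \cdots + \nu_r = m/2 \\ \nu_i \ge 0}} \prod_{i=1}^{r} \frac{x_i^{2\nu_i}}{\nu_i!} \\
&= \frac{m!}{2^m (m/2)!} \sum_{\substack{\nu_1 + \cdots + \nu_r = m/2 \\ \nu_i \ge 0}} (m/2)! \prod_{i=1}^{r} \frac{(x_i^2)^{\nu_i}}{\nu_i!} \\
&= c_m \sum_{\substack{\nu_1 + \cdots + \nu_r = m/2 \\ \nu_i \ge 0}} \frac{(m/2)!}{\nu_1! \cdots \nu_r!} (x_1^2)^{\nu_1} \cdots (x_r^2)^{\nu_r} \\
&= c_m \left( x_1^2 + \cdots + x_r^2 \right)^{m/2}.
\end{align*}
This proves the polynomial identity.

Theorem 3.4 (Hilbert’s identity) For every $k \ge 1$ and $r \ge 1$ there exist an
integer M and positive rational numbers $a_i$ and integers $b_{i,j}$ for $i = 1, \ldots, M$ and
$j = 1, \ldots, r$ such that
\[ \left( x_1^2 + \cdots + x_r^2 \right)^k = \sum_{i=1}^{M} a_i \left( b_{i,1} x_1 + \cdots + b_{i,r} x_r \right)^{2k}. \tag{3.5} \]

Proof. Choose n > 2k, and let $\beta_1^*, \ldots, \beta_n^*, \rho_1^*, \ldots, \rho_n^*$ be the rational numbers
constructed in Lemma 3.7. Then $\beta_1^*, \ldots, \beta_n^*$ are pairwise distinct and $\rho_1^*, \ldots, \rho_n^*$
are positive. We use these numbers in Lemma 3.8 with m = 2k and obtain the
polynomial identity
\[ c_{2k} \left( x_1^2 + \cdots + x_r^2 \right)^k = \sum_{j_1=1}^{n} \cdots \sum_{j_r=1}^{n} \rho_{j_1}^* \cdots \rho_{j_r}^* \left( \beta_{j_1}^* x_1 + \cdots + \beta_{j_r}^* x_r \right)^{2k}. \]
Let q be a common denominator of the n fractions $\beta_1^*, \ldots, \beta_n^*$. Then $q\beta_j^*$ is an
integer for all j, and
\[ \left( x_1^2 + \cdots + x_r^2 \right)^k = \sum_{j_1=1}^{n} \cdots \sum_{j_r=1}^{n} \frac{\rho_{j_1}^* \cdots \rho_{j_r}^*}{c_{2k}\, q^{2k}} \left( q\beta_{j_1}^* x_1 + \cdots + q\beta_{j_r}^* x_r \right)^{2k} \]
is a polynomial identity of Hilbert type. This completes the proof.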
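The construction in the proof of Theorem 3.4 can be carried out explicitly in a small case. The sketch below takes k = 2 and r = 4 with n = 5 > 2k, and assumes the rational nodes −2, −1, 0, 1, 2 with weights 1/96, 5/24, 9/16, 5/24, 1/96; these are one admissible choice for Lemma 3.7 (the code re-checks that they satisfy (3.4)), not values taken from the text. Since the nodes are integers, the common denominator is q = 1.

```python
from fractions import Fraction
from itertools import product
from math import prod

BETAS = [-2, -1, 0, 1, 2]                                   # assumed rational nodes
RHOS = [Fraction(1, 96), Fraction(5, 24), Fraction(9, 16),
        Fraction(5, 24), Fraction(1, 96)]                   # assumed positive weights
C = [Fraction(1), Fraction(0), Fraction(1, 2), Fraction(0), Fraction(3, 4)]  # c_0..c_4

# The nodes and weights must satisfy the linear system (3.4) for n = 5.
for k in range(5):
    assert sum(Fraction(b) ** k * rho for b, rho in zip(BETAS, RHOS)) == C[k]

def hilbert_terms_k2(r):
    """Pairs (a_i, b_i) with (x_1^2 + ... + x_r^2)^2 = sum of a_i * (b_i . x)^4."""
    terms = []
    for js in product(range(len(BETAS)), repeat=r):
        a = prod((RHOS[j] for j in js), start=Fraction(1)) / C[4]
        b = tuple(BETAS[j] for j in js)
        if any(b):                  # the all-zero linear form contributes nothing
            terms.append((a, b))
    return terms

def evaluate(terms, x):
    return sum(a * sum(bi * xi for bi, xi in zip(b, x)) ** 4 for a, b in terms)

terms = hilbert_terms_k2(4)
x = (1, 2, 3, 4)
print(len(terms), evaluate(terms, x) == sum(v * v for v in x) ** 2)
```

The number of terms produced this way is far from minimal (compare the 12 terms of Liouville's identity); the construction only shows that some identity of the form (3.5) exists.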


Lemma 3.9 Let $k \ge 1$. If there exist positive rational numbers $a_1, \ldots, a_M$ such
that every sufficiently large integer n can be written in the form
\[ n = \sum_{i=1}^{M} a_i x_i^k, \tag{3.6} \]
where $x_1, \ldots, x_M$ are nonnegative integers, then Waring’s problem is true for
exponent k.


Proof. Choose $n_0$ such that every integer $n \ge n_0$ can be represented in the
form (3.6). Let q be the least common denominator of the fractions $a_1, \ldots, a_M$.
Then $qa_i \in \mathbf{Z}$ for $i = 1, \ldots, M$, and qn is a sum of $\sum_{i=1}^{M} qa_i$ nonnegative kth
powers for every $n \ge n_0$. Since every integer $N \ge qn_0$ can be written in the form
$N = qn + r$, where $n \ge n_0$ and $0 \le r \le q - 1$, it follows that N can be written as
the sum of $\sum_{i=1}^{M} qa_i + q - 1$ nonnegative kth powers. Clearly, every nonnegative
integer $N < qn_0$ can be written as the sum of a bounded number of kth powers,
and so Waring’s problem holds for k. This completes the proof.

The following notation is due to Stridsberg: Let $\sum_{i=1}^{M} a_i x_i^k$ be a fixed diagonal
form of degree k with positive rational coefficients $a_1, \ldots, a_M$. We write $n = \sum(k)$
if there exist nonnegative integers $x_1, \ldots, x_M$ such that
\[ n = \sum_{i=1}^{M} a_i x_i^k. \tag{3.7} \]
We let $\sum(k)$ denote any integer of the form (3.7). Then $\sum(k) + \sum(k) = \sum(k)$
and $\sum(2k) = \sum(k)$. Lemma 3.9 can be restated as follows: If $n = \sum(k)$ for every
sufficiently large nonnegative integer n, then Waring’s problem is true for exponent
k.

Theorem 3.5 If Waring’s problem holds for k, then Waring’s problem holds for
2k.


Proof. We use Hilbert’s identity (3.5) for k with r = 4:
\[ \left( x_1^2 + \cdots + x_4^2 \right)^k = \sum_{i=1}^{M} a_i \left( b_{i,1} x_1 + \cdots + b_{i,4} x_4 \right)^{2k}. \]
Let y be a nonnegative integer. By Lagrange’s theorem, there exist nonnegative
integers $x_1, x_2, x_3, x_4$ such that
\[ y = x_1^2 + x_2^2 + x_3^2 + x_4^2, \]
and so
\[ y^k = \sum_{i=1}^{M} a_i z_i^{2k}, \tag{3.8} \]
where
\[ z_i = \left| b_{i,1} x_1 + \cdots + b_{i,4} x_4 \right| \]
is a nonnegative integer (the exponent 2k is even, so taking absolute values
does not change the sum). This means that
\[ y^k = \sum(2k) \]
for every nonnegative integer y. If Waring’s problem is true for k, then every
nonnegative integer is the sum of a bounded number of kth powers, and so every
nonnegative integer is the sum of a bounded number of numbers of the form $\sum(2k)$.
By Lemma 3.9, Waring’s problem holds for exponent 2k. This completes the proof.


3.3 A proof by induction


We shall use Hilbert’s identity to obtain Waring’s problem for all exponents $k \ge 2$.
The proof is by induction on k. The starting point is Lagrange’s theorem that every 
nonnegative integer is the sum of four squares. This is the case where k = 2. We 
shall prove that if k > 2 and Waring’s problem is true for every exponent less than 
k, then it is also true for k. 


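Lemma 3.10 Let $k \ge 2$ and $0 \le \ell \le k$. There exist positive integers $B_{0,\ell}, B_{1,\ell},
\ldots, B_{\ell-1,\ell}$ depending only on k and $\ell$ such that
\[ x^{2\ell} T^{k-\ell} + \sum_{i=0}^{\ell-1} B_{i,\ell}\, x^{2i} T^{k-i} = \sum(2k) \]
for all integers x and T satisfying
\[ x^2 \le T. \]

Proof. We begin with Hilbert’s identity for exponent $k + \ell$ with r = 5:
\[ \left( x_1^2 + \cdots + x_5^2 \right)^{k+\ell} = \sum_{i=1}^{M_\ell} a_i \left( b_{i,1} x_1 + \cdots + b_{i,5} x_5 \right)^{2k+2\ell}, \]
where the integers $M_\ell$ and $b_{i,j}$ and the positive rational numbers $a_i$ depend only
on k and $\ell$. Let U be a nonnegative integer. By Lagrange’s theorem, we can write
\[ U = x_1^2 + x_2^2 + x_3^2 + x_4^2 \]
for nonnegative integers $x_1, x_2, x_3, x_4$. Let $x_5 = x$. We obtain the polynomial
identity
\[ \left( x^2 + U \right)^{k+\ell} = \sum_{i=1}^{M_\ell} a_i \left( b_i x + c_i \right)^{2k+2\ell}, \tag{3.9} \]
where the numbers $M_\ell$, $a_i$, and $b_i = b_{i,5}$ depend only on k and $\ell$, and the integers
$c_i = b_{i,1} x_1 + \cdots + b_{i,4} x_4$ depend on k, $\ell$, and U. Note that $2\ell \le k + \ell$ since $\ell \le k$.
Differentiating the polynomial on the left side of (3.9) $2\ell$ times, we obtain (see
Exercise 6)
\[ \frac{d^{2\ell}}{dx^{2\ell}} \left( (x^2 + U)^{k+\ell} \right) = \sum_{i=0}^{\ell} A_{i,\ell}\, x^{2i} (x^2 + U)^{k-i}, \]
where the $A_{i,\ell}$ are positive integers that depend only on k and $\ell$. Differentiating
the polynomial on the right side of (3.9) $2\ell$ times, we obtain
\begin{align*}
\frac{d^{2\ell}}{dx^{2\ell}} \left( \sum_{i=1}^{M_\ell} a_i (b_i x + c_i)^{2k+2\ell} \right)
&= \sum_{i=1}^{M_\ell} (2k+1)(2k+2) \cdots (2k+2\ell)\, b_i^{2\ell} a_i (b_i x + c_i)^{2k} \\
&= \sum_{i=1}^{M_\ell} a_i' (b_i x + c_i)^{2k} \\
&= \sum_{i=1}^{M_\ell} a_i' y_i^{2k},
\end{align*}
where $y_i = |b_i x + c_i|$ is a nonnegative integer and
\[ a_i' = (2k+1)(2k+2) \cdots (2k+2\ell)\, b_i^{2\ell} a_i \]
is a nonnegative rational number depending only on k and $\ell$. It follows that, if x
and U are integers and $U \ge 0$, then there exist nonnegative integers $y_1, \ldots, y_{M_\ell}$
such that
\[ \sum_{i=0}^{\ell} A_{i,\ell}\, x^{2i} (x^2 + U)^{k-i} = \sum_{i=1}^{M_\ell} a_i' y_i^{2k}. \]
Let x and T be nonnegative integers such that $x^2 \le T$. Since $A_{\ell,\ell}$ is a positive
integer, it follows that $x^2 \le A_{\ell,\ell} T$, and so
\[ U = A_{\ell,\ell} T - x^2 \]
is a nonnegative integer. With this choice of U, we have
\begin{align*}
\sum_{i=0}^{\ell} A_{i,\ell}\, x^{2i} (x^2 + U)^{k-i}
&= \sum_{i=0}^{\ell} A_{i,\ell}\, x^{2i} (A_{\ell,\ell} T)^{k-i} \\
&= \sum_{i=0}^{\ell} A_{i,\ell} A_{\ell,\ell}^{k-i}\, x^{2i} T^{k-i} \\
&= A_{\ell,\ell}^{k-\ell+1} \sum_{i=0}^{\ell} A_{i,\ell} A_{\ell,\ell}^{\ell-i-1}\, x^{2i} T^{k-i} \\
&= A_{\ell,\ell}^{k-\ell+1} \sum_{i=0}^{\ell} B_{i,\ell}\, x^{2i} T^{k-i},
\end{align*}
where $B_{\ell,\ell} = 1$ and
\[ B_{i,\ell} = A_{i,\ell} A_{\ell,\ell}^{\ell-i-1} \]
is a positive integer for $i = 0, \ldots, \ell - 1$. Let
\[ a_i'' = \frac{a_i'}{A_{\ell,\ell}^{k-\ell+1}}. \]
Then
\[ x^{2\ell} T^{k-\ell} + \sum_{i=0}^{\ell-1} B_{i,\ell}\, x^{2i} T^{k-i} = \sum_{i=1}^{M_\ell} a_i'' y_i^{2k} = \sum(2k). \]
This completes the proof.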


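Theorem 3.6 (Hilbert-Waring) The set of nonnegative kth powers is a basis of
finite order for every positive integer k.


Proof. This is by induction on k. The case k = 1 is clear, and the case k = 2
is Theorem 1.1 (Lagrange’s theorem). Let $k \ge 3$, and suppose that the set of $\ell$th
powers is a basis of finite order for every $\ell < k$. By Theorem 3.5, the set of $(2\ell)$-th
powers is a basis of finite order for $\ell = 1, 2, \ldots, k-1$. Therefore, there exists an
integer r such that, for every nonnegative integer n and for $\ell = 1, \ldots, k-1$, the
equation
\[ n = x_1^{2\ell} + \cdots + x_r^{2\ell} \]
is solvable in nonnegative integers $x_{1,\ell}, \ldots, x_{r,\ell}$. (For example, we could let
$r = \max\{g(2\ell) : \ell = 1, 2, \ldots, k-1\}$.)
Let $T \ge 2$. Choose integers $C_1, \ldots, C_{k-1}$ such that
\[ 0 \le C_\ell < T \qquad \text{for } \ell = 1, \ldots, k-1. \]
There exist nonnegative integers $x_{j,\ell}$ for $j = 1, \ldots, r$ and $\ell = 1, \ldots, k-1$ such
that
\[ x_{1,\ell}^{2\ell} + \cdots + x_{r,\ell}^{2\ell} = C_{k-\ell}. \tag{3.10} \]
Then
\[ x_{j,\ell}^{2i} \le x_{j,\ell}^{2\ell} \le C_{k-\ell} < T \]
for $j = 1, \ldots, r$, $\ell = 1, \ldots, k-1$, and $i = 1, \ldots, \ell$. By Lemma 3.10, there exist
positive integers $B_{i,\ell}$ depending only on k and $\ell$ such that
\[ x_{j,\ell}^{2\ell} T^{k-\ell} + \sum_{i=0}^{\ell-1} B_{i,\ell}\, x_{j,\ell}^{2i} T^{k-i} = \sum(2k) = \sum(k). \tag{3.11} \]
Summing (3.11) for $j = 1, \ldots, r$ and using (3.10), we obtain
\begin{align*}
C_{k-\ell} T^{k-\ell} + \sum_{i=0}^{\ell-1} B_{i,\ell} T^{k-i} \sum_{j=1}^{r} x_{j,\ell}^{2i}
&= C_{k-\ell} T^{k-\ell} + T^{k-\ell+1} \sum_{i=0}^{\ell-1} B_{i,\ell} T^{\ell-1-i} \sum_{j=1}^{r} x_{j,\ell}^{2i} \\
&= C_{k-\ell} T^{k-\ell} + D_{k-\ell+1} T^{k-\ell+1} \\
&= \sum(k),
\end{align*}
where
\[ D_{k-\ell+1} = \sum_{i=0}^{\ell-1} B_{i,\ell} T^{\ell-1-i} \sum_{j=1}^{r} x_{j,\ell}^{2i} \]
for $\ell = 1, \ldots, k-1$. The integer $D_{k-\ell+1}$ is completely determined by k, $\ell$, T, and
$C_{k-\ell}$ and is independent of $C_{k-i}$ for $i \ne \ell$. Let
\[ B^* = \max\{B_{i,\ell} : \ell = 1, \ldots, k-1 \text{ and } i = 0, 1, \ldots, \ell-1\}. \]
Then
\begin{align*}
0 &\le C_{k-\ell} T^{k-\ell} + D_{k-\ell+1} T^{k-\ell+1} \\
&= C_{k-\ell} T^{k-\ell} + \sum_{i=0}^{\ell-1} B_{i,\ell} T^{k-i} \sum_{j=1}^{r} x_{j,\ell}^{2i} \\
&< B^* \left( T^{k-\ell+1} + rT^k + \sum_{i=1}^{\ell-1} T^{k-i+1} \right) \\
&= B^* \left( rT^k + T^{k-\ell+1} \sum_{i=0}^{\ell-1} T^i \right) \\
&< B^* \left( rT^k + \frac{T^{k+1}}{T-1} \right) \\
&\le (r+2) B^* T^k,
\end{align*}
since $T/(T-1) \le 2$ for $T \ge 2$. Let
\[ C_k = D_1 = 0. \]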
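Then
\[ \sum_{\ell=1}^{k-1} \left( C_{k-\ell} T^{k-\ell} + D_{k-\ell+1} T^{k-\ell+1} \right) = \sum_{\ell=1}^{k} (C_\ell + D_\ell) T^\ell = \sum(k) \]
and
\[ 0 \le \sum_{\ell=1}^{k} (C_\ell + D_\ell) T^\ell < (k-1)(r+2) B^* T^k = E^* T^k, \]
where the integer
\[ E^* = (k-1)(r+2) B^* \]
is determined by k and is independent of T. If we choose
\[ T > E^*, \]
then
\[ 0 \le \sum_{\ell=1}^{k} (C_\ell + D_\ell) T^\ell < E^* T^k < T^{k+1}, \]
and so the expansion of $\sum_{\ell=1}^{k} (C_\ell + D_\ell) T^\ell$ to base T is of the form
\[ \sum_{\ell=1}^{k} (C_\ell + D_\ell) T^\ell = E_1 T + \cdots + E_{k-1} T^{k-1} + E_k T^k, \tag{3.12} \]
where
\[ 0 \le E_i < T \qquad \text{for } i = 1, \ldots, k-1 \]
and
\[ 0 \le E_k < E^*. \]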
In this way, every choice of a (k − 1)-tuple $(C_1, \ldots, C_{k-1})$ of integers in
$\{0, 1, \ldots, T-1\}$ determines another (k − 1)-tuple $(E_1, \ldots, E_{k-1})$ of integers in
$\{0, 1, \ldots, T-1\}$. We shall prove that this map of (k − 1)-tuples is bijective.
It suffices to prove it is surjective. Let $(E_1, \ldots, E_{k-1})$ be a (k − 1)-tuple of
integers in $\{0, 1, \ldots, T-1\}$. There is a simple algorithm that generates integers
$C_1, C_2, \ldots, C_{k-1} \in \{0, 1, \ldots, T-1\}$ such that (3.12) is satisfied for some
nonnegative integer $E_k < E^*$. Let $C_1 = E_1$ and $I_2 = 0$. Since $D_1 = 0$, we have
\[ (C_1 + D_1) T = E_1 T + I_2 T^2. \]
The integer $C_1$ determines the integer $D_2$. Choose $C_2 \in \{0, 1, \ldots, T-1\}$ such
that
\[ C_2 + D_2 + I_2 \equiv E_2 \pmod{T}. \]
Then
\[ C_2 + D_2 + I_2 = E_2 + I_3 T \]
for some integer $I_3$, and
\[ \sum_{\ell=1}^{2} (C_\ell + D_\ell) T^\ell = \sum_{\ell=1}^{2} E_\ell T^\ell + I_3 T^3. \]
The integer $C_2$ determines $D_3$. Choose $C_3 \in \{0, 1, \ldots, T-1\}$ such that
\[ C_3 + D_3 + I_3 \equiv E_3 \pmod{T}. \]
Then
\[ C_3 + D_3 + I_3 = E_3 + I_4 T \]
for some integer $I_4$, and
\[ \sum_{\ell=1}^{3} (C_\ell + D_\ell) T^\ell = \sum_{\ell=1}^{3} E_\ell T^\ell + I_4 T^4. \]
Let $2 \le j \le k-1$, and suppose that we have constructed integers $I_j$ and
\[ C_1, \ldots, C_{j-1} \in \{0, 1, \ldots, T-1\} \]
such that
\[ \sum_{\ell=1}^{j-1} (C_\ell + D_\ell) T^\ell = \sum_{\ell=1}^{j-1} E_\ell T^\ell + I_j T^j. \]
There exists a unique integer $C_j \in \{0, 1, \ldots, T-1\}$ such that
\[ C_j + D_j + I_j \equiv E_j \pmod{T}. \]
Then
\[ C_j + D_j + I_j = E_j + I_{j+1} T \]
for some integer $I_{j+1}$, and
\[ \sum_{\ell=1}^{j} (C_\ell + D_\ell) T^\ell = \sum_{\ell=1}^{j} E_\ell T^\ell + I_{j+1} T^{j+1}. \]
It follows by induction that this procedure generates a unique sequence of integers
$C_1, C_2, \ldots, C_{k-1} \in \{0, 1, \ldots, T-1\}$ such that
\[ \sum_{\ell=1}^{k-1} (C_\ell + D_\ell) T^\ell = \sum_{\ell=1}^{k-1} E_\ell T^\ell + I_k T^k. \]


Since $C_k = 0$ and $C_{k-1}$ determines $D_k$, we have
\[ 0 \le \sum_{\ell=1}^{k} (C_\ell + D_\ell) T^\ell = \sum_{\ell=1}^{k-1} E_\ell T^\ell + (D_k + I_k) T^k = \sum_{\ell=1}^{k} E_\ell T^\ell < E^* T^k, \]
where $D_k + I_k = E_k$. Since
\[ 0 \le \sum_{\ell=1}^{k-1} E_\ell T^\ell < T^k, \]
it follows that
\[ 0 \le E_k < E^* \]
and
\[ \sum_{\ell=1}^{k-1} E_\ell T^\ell + E^* T^k < (1 + E^*) T^k \le 2E^* T^k. \tag{3.13} \]
Recall that
\[ \sum_{\ell=1}^{k} E_\ell T^\ell = \sum_{\ell=1}^{k} (C_\ell + D_\ell) T^\ell = \sum(k). \]
Since $E^*$ depends only on k and not on T, it follows that
\[ (E^* - E_k) T^k = \sum(k), \]
and so
\[ \sum_{\ell=1}^{k-1} E_\ell T^\ell + E^* T^k = \sum(k) \tag{3.14} \]
for every (k − 1)-tuple $(E_1, \ldots, E_{k-1})$ of integers $E_\ell \in \{0, 1, \ldots, T-1\}$. Choose
the integer $T_0 > 5E^*$ so that
\[ 4(T+1)^k \le 5T^k \qquad \text{for all } T \ge T_0. \]

We shall prove that if $T \ge T_0$ and if $(F_0, F_1, \ldots, F_{k-1})$ is any k-tuple of integers
in $\{0, 1, \ldots, T-1\}$, then
\[ F_0 + F_1 T + \cdots + F_{k-1} T^{k-1} + 5E^* T^k = \sum(k). \]
We use the following trick. Let $E_0 \in \{0, 1, \ldots, T-1\}$. Applying (3.13) with T + 1
in place of T, we obtain
\begin{align*}
E_0 (T+1) + E^* (T+1)^k &< (T+1)^2 + E^* (T+1)^k \\
&\le (1 + E^*)(T+1)^k \\
&\le 2E^* (T+1)^k. \tag{3.15}
\end{align*}
Applying (3.14) with T + 1 in place of T, we obtain
\[ E_0 (T+1) + E^* (T+1)^k = \sum(k). \tag{3.16} \]
Adding equations (3.14) and (3.16), we see that for every choice of k integers
\[ E_0, E_1, \ldots, E_{k-1} \in \{0, 1, \ldots, T-1\}, \]
we have
\begin{align*}
F^* &= \left( E_1 T + \cdots + E_{k-1} T^{k-1} + E^* T^k \right) + \left( E_0 (T+1) + E^* (T+1)^k \right) \\
&= (E_0 + E^*) + (E_0 + E_1 + kE^*) T + \sum_{\ell=2}^{k-1} \left( E_\ell + \binom{k}{\ell} E^* \right) T^\ell + 2E^* T^k \\
&= \sum(k).
\end{align*}
Moreover, it follows from (3.13) and (3.15) that
\[ 0 < F^* < 4E^* (T+1)^k \le 5E^* T^k < T^{k+1} \]
since $4(T+1)^k \le 5T^k$ and $T \ge T_0 > 5E^*$. Given any k integers
\[ F_0, F_1, \ldots, F_{k-1} \in \{0, 1, \ldots, T-1\}, \]
we can again apply our algorithm (see Exercise 7) to obtain integers $F_k$ and
\[ E_0, E_1, E_2, \ldots, E_{k-1} \in \{0, 1, \ldots, T-1\} \]
such that
\begin{align*}
F_0 + F_1 T + \cdots + F_{k-1} T^{k-1} + F_k T^k
&= E_1 T + \cdots + E_{k-1} T^{k-1} + E^* T^k + E_0 (T+1) + E^* (T+1)^k \\
&= \sum(k),
\end{align*}
where $F_k$ is an integer that satisfies
\[ 0 \le F_k < 5E^*. \]
After the addition of $(5E^* - F_k) T^k = \sum(k)$, we obtain
\[ F_0 + F_1 T + \cdots + F_{k-1} T^{k-1} + 5E^* T^k = \sum(k) \]
for all $T \ge T_0$ and for all choices of $F_0, F_1, \ldots, F_{k-1} \in \{0, 1, \ldots, T-1\}$. This
proves that $n = \sum(k)$ if $T \ge T_0$ and
\[ 5E^* T^k \le n < (5E^* + 1) T^k. \]
There exists an integer $T_1 \ge T_0$ such that
\[ 5E^* (T+1)^k < (5E^* + 1) T^k \qquad \text{for all } T \ge T_1. \]
Then $n = \sum(k)$ if $T \ge T_1$ and
\[ 5E^* T^k \le n < 5E^* (T+1)^k. \tag{3.17} \]
Since every integer $n \ge 5E^* T_1^k$ satisfies inequality (3.17) for some $T \ge T_1$, we
have
\[ n = \sum(k) \qquad \text{for all } n \ge 5E^* T_1^k. \]
It follows from Lemma 3.9 that Waring’s problem holds for exponent k. This
completes the proof of the Hilbert-Waring theorem.


3.4 Notes


The polynomial identities in Theorems 3.1, 3.2, and 3.3 are due to Liouville [79, 
pages 112-115], Fleck [40], and Hurwitz [65], respectively. Hurwitz’s observa- 
tions [65] on polynomial identities appeared in 1908. 

Hilbert [56] published his proof of Waring’s problem in 1909 in a paper ded- 
icated to the memory of Minkowski. The original proof was quickly simplified 
by several authors. The proof of Hilbert’s identity given in this book is due to 
Hausdorff [52], and the inductive argument that allows us to go from exponent k 
to exponent k + 1 is due to Stridsberg [120]. Oppenheim [94] contains an excellent 
account of the Hausdorff—Stridsberg proof of Hilbert’s theorem. Schmidt [105] 
introduced a convexity argument to prove Hilbert’s identity. This is the argument 
that Ellison [28] uses in his excellent survey paper on Waring’s problem. Dress [25] 
gives a different proof of the Hilbert-Waring theorem that involves a clever ap- 
plication of the easier Waring’s problem to avoid induction on the exponent k. 
Rieger [102] used Hilbert’s method to obtain explicit estimates for g(k). 


3.5 Exercises 


1. (Euler) Let [x] denote the integer part of x, and let
\[ q = \left[ \left( \frac{3}{2} \right)^k \right]. \]
Prove that
\[ g(k) \ge 2^k + q - 2. \]
Hint: Consider the number $N = q2^k - 1$.


2. Verify the polynomial identity in Theorem 3.2, and obtain an explicit upper 
bound for g(6). 


3. Verify the polynomial identity in Theorem 3.3, and obtain an explicit upper 
bound for g(8). 


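4. (Schur) Verify the polynomial identity
\begin{align*}
22{,}680 \left( x_1^2 + x_2^2 + x_3^2 + x_4^2 \right)^5
= {} & 9 \sum_{1 \le i \le 4} (2x_i)^{10} + 180 \sum_{1 \le i < j \le 4} (x_i \pm x_j)^{10} + \sum_{i;\, j < k} (2x_i \pm x_j \pm x_k)^{10} \\
& + 9 (x_1 \pm x_2 \pm x_3 \pm x_4)^{10},
\end{align*}
where the third sum runs over the 12 choices of an index i and a pair j < k of
indices different from i.


5. Show that every integer of the form $22{,}680a^5$ is the sum of 2316 nonnegative
integral 10th powers.

6. Let k, $\ell$, and U be integers such that $0 \le \ell \le k$. Let
\[ f(x) = (x^2 + U)^{k+\ell}. \]
Show that there exist positive integers $A_0, A_1, \ldots, A_\ell$ depending only on k
and $\ell$ such that
\[ \frac{d^{2\ell} f}{dx^{2\ell}} = \sum_{i=0}^{\ell} A_i x^{2i} (x^2 + U)^{k-i}. \]


7. Let $k \ge 1$, $T \ge 2$, and $D_i$, $E_i$ be integers for $i = 0, 1, \ldots, k-1$. Prove that
there exist unique integers $C_0, \ldots, C_{k-1}$ and $I_k$ such that
\[ 0 \le C_i < T \qquad \text{for } i = 0, 1, \ldots, k-1 \]
and
\[ \sum_{\ell=0}^{k-1} (C_\ell + D_\ell) T^\ell = \sum_{\ell=0}^{k-1} E_\ell T^\ell + I_k T^k. \]


8. This is an exercise in notation: Prove that $\sum(2k) = \sum(k)$ but $\sum(k) \ne \sum(2k)$.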


4 
Weyl’s inequality 


The analytic method of Hardy and Littlewood (sometimes called the 
‘circle method’) was developed for the treatment of additive problems 
in the theory of numbers. These are problems which concern the rep- 
resentation of a large number as a sum of numbers of some specified 
type. The number of summands may be either fixed or unrestricted; in 
the latter case we speak of partition problems. The most famous ad- 
ditive problem is Waring’s Problem, where the specified numbers are 
kth powers .... The most important single tool for the investigation 
of Waring’s Problem, and indeed many other problems in the analytic 
theory of numbers, is Weyl’s inequality. 


H. Davenport [18] 


4.1 Tools 


The purpose of this chapter is to develop some analytical tools that will be needed 
to prove the Hardy—Littlewood asymptotic formula for Waring’s problem and other 
results in additive number theory. The most important of these tools are two in- 
equalities for exponential sums, Weyl’s inequality and Hua’s lemma. We shall also 
introduce partial summation, infinite products, and Euler products. 

We begin with the following simple result about approximating real numbers
by rationals with small denominators. Recall that [x] denotes the integer part of
the real number x and that {x} denotes the fractional part of x.

Theorem 4.1 (Dirichlet) Let $\alpha$ and Q be real numbers, $Q \ge 1$. There exist
integers a and q such that
\[ 1 \le q \le Q, \qquad (a, q) = 1, \]
and
\[ \left| \alpha - \frac{a}{q} \right| < \frac{1}{qQ}. \]

Proof. Let N = [Q]. Suppose that $\{q\alpha\} \in [0, 1/(N+1))$ for some positive
integer $q \le N$. If $a = [q\alpha]$, then
\[ 0 \le \{q\alpha\} = q\alpha - [q\alpha] = q\alpha - a < \frac{1}{N+1}, \]
and so
\[ \left| \alpha - \frac{a}{q} \right| < \frac{1}{q(N+1)} < \frac{1}{qQ} \le \frac{1}{q^2}. \]
Similarly, if $\{q\alpha\} \in [N/(N+1), 1)$ for some positive integer $q \le N$ and if
$a = [q\alpha] + 1$, then
\[ \frac{N}{N+1} \le \{q\alpha\} = q\alpha - a + 1 < 1 \]
implies that
\[ |q\alpha - a| \le \frac{1}{N+1}, \]
and so
\[ \left| \alpha - \frac{a}{q} \right| \le \frac{1}{q(N+1)} < \frac{1}{qQ} \le \frac{1}{q^2}. \]
If
\[ \{q\alpha\} \in \left[ \frac{1}{N+1}, \frac{N}{N+1} \right) \]
for all $q = 1, \ldots, N$, then each of the N real numbers $\{q\alpha\}$ lies in one of the N − 1
intervals
\[ \left[ \frac{i}{N+1}, \frac{i+1}{N+1} \right) \qquad \text{for } i = 1, \ldots, N-1. \]
By Dirichlet’s box principle, there exist integers $i \in [1, N-1]$ and $q_1, q_2 \in [1, N]$
such that
\[ 1 \le q_1 < q_2 \le N \]
and
\[ \{q_1\alpha\}, \{q_2\alpha\} \in \left[ \frac{i}{N+1}, \frac{i+1}{N+1} \right). \]
Let
\[ q = q_2 - q_1 \in [1, N-1] \]
and
\[ a = [q_2\alpha] - [q_1\alpha]. \]
Then
\[ |q\alpha - a| = \left| (q_2\alpha - [q_2\alpha]) - (q_1\alpha - [q_1\alpha]) \right| = \left| \{q_2\alpha\} - \{q_1\alpha\} \right| < \frac{1}{N+1} < \frac{1}{Q}. \]
In each case, dividing a and q by their greatest common divisor makes q no
larger and leaves a/q unchanged, so we may assume that (a, q) = 1.
This completes the proof.


4.2 Difference operators


The forward difference operator $\Delta_d$ is the linear operator defined on functions f
by the formula
\[ \Delta_d(f)(x) = f(x + d) - f(x). \]


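For $\ell \ge 2$, we define the iterated difference operator $\Delta_{d_\ell, d_{\ell-1}, \ldots, d_1}$ by
\[ \Delta_{d_\ell, d_{\ell-1}, \ldots, d_1} = \Delta_{d_\ell} \circ \Delta_{d_{\ell-1}, \ldots, d_1} = \Delta_{d_\ell} \circ \Delta_{d_{\ell-1}} \circ \cdots \circ \Delta_{d_1}. \]
For example,
\begin{align*}
\Delta_{d_2, d_1}(f)(x) &= \Delta_{d_2} \left( \Delta_{d_1}(f) \right)(x) \\
&= \left( \Delta_{d_1}(f) \right)(x + d_2) - \left( \Delta_{d_1}(f) \right)(x) \\
&= f(x + d_2 + d_1) - f(x + d_2) - f(x + d_1) + f(x)
\end{align*}
and
\begin{align*}
\Delta_{d_3, d_2, d_1}(f)(x) = {} & f(x + d_3 + d_2 + d_1) - f(x + d_3 + d_2) \\
& - f(x + d_3 + d_1) - f(x + d_2 + d_1) \\
& + f(x + d_3) + f(x + d_2) + f(x + d_1) - f(x).
\end{align*}
We let $\Delta^{(\ell)}$ denote the iterated difference operator $\Delta_{1, \ldots, 1}$ with $d_i = 1$ for
$i = 1, \ldots, \ell$. Then
\[ \Delta^{(2)}(f)(x) = f(x + 2) - 2f(x + 1) + f(x) \]
and
\[ \Delta^{(3)}(f)(x) = f(x + 3) - 3f(x + 2) + 3f(x + 1) - f(x). \]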
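Lemma 4.1 Let $\ell \ge 1$. Then
\[ \Delta^{(\ell)}(f)(x) = \sum_{j=0}^{\ell} (-1)^{\ell-j} \binom{\ell}{j} f(x + j). \]

Proof. This is by induction on $\ell$. The case $\ell = 1$ is the definition of the
forward difference operator. If the lemma holds for $\ell$, then
\begin{align*}
\Delta^{(\ell+1)}(f)(x)
&= \Delta \left( \Delta^{(\ell)}(f) \right)(x) \\
&= \Delta \left( \sum_{j=0}^{\ell} (-1)^{\ell-j} \binom{\ell}{j} f(x + j) \right) \\
&= \sum_{j=0}^{\ell} (-1)^{\ell-j} \binom{\ell}{j} \Delta(f)(x + j) \\
&= \sum_{j=0}^{\ell} (-1)^{\ell-j} \binom{\ell}{j} f(x + j + 1) + \sum_{j=0}^{\ell} (-1)^{\ell+1-j} \binom{\ell}{j} f(x + j) \\
&= \sum_{j=1}^{\ell+1} (-1)^{\ell+1-j} \binom{\ell}{j-1} f(x + j) + \sum_{j=0}^{\ell} (-1)^{\ell+1-j} \binom{\ell}{j} f(x + j) \\
&= f(x + \ell + 1) + \sum_{j=1}^{\ell} (-1)^{\ell+1-j} \left( \binom{\ell}{j-1} + \binom{\ell}{j} \right) f(x + j) + (-1)^{\ell+1} f(x) \\
&= \sum_{j=0}^{\ell+1} (-1)^{\ell+1-j} \binom{\ell+1}{j} f(x + j).
\end{align*}
This completes the proof.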
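We shall compute the polynomial obtained by applying an iterated difference
operator to the power function $f(x) = x^k$.


Lemma 4.2 Let $k \geq 1$ and $1 \leq \ell \leq k$. Let $\Delta_{d_\ell, \ldots, d_1}$ be an iterated difference operator. Then
\[
\begin{aligned}
\Delta_{d_\ell, \ldots, d_1}(x^k)
&= \sum_{\substack{j_1 + \cdots + j_\ell + j = k \\ j \geq 0,\ j_1, \ldots, j_\ell \geq 1}} \frac{k!}{j!\, j_1! \cdots j_\ell!}\, d_1^{j_1} \cdots d_\ell^{j_\ell} x^j \qquad (4.1) \\
&= d_1 \cdots d_\ell\, p_{k-\ell}(x),
\end{aligned}
\]
where $p_{k-\ell}(x)$ is a polynomial of degree $k - \ell$ and leading coefficient
$k(k-1) \cdots (k - \ell + 1)$. If $d_1, \ldots, d_\ell$ are integers, then $p_{k-\ell}(x)$ is a polynomial with
integer coefficients.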


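Proof. This is by induction on $\ell$. For $\ell = 1$, we have
\[
\Delta_{d_1}(x^k) = (x + d_1)^k - x^k
= \sum_{\substack{j_1 + j = k \\ j \geq 0,\ j_1 \geq 1}} \frac{k!}{j!\, j_1!}\, d_1^{j_1} x^j.
\]
Let $1 \leq \ell \leq k - 1$, and assume that formula (4.1) holds for $\ell$. Then
\[
\begin{aligned}
\Delta_{d_{\ell+1}, d_\ell, \ldots, d_1}(x^k)
&= \Delta_{d_{\ell+1}}\left( \Delta_{d_\ell, \ldots, d_1}(x^k) \right) \\
&= \sum_{\substack{j_1 + \cdots + j_\ell + m = k \\ m \geq 0,\ j_1, \ldots, j_\ell \geq 1}} \frac{k!}{m!\, j_1! \cdots j_\ell!}\, d_1^{j_1} \cdots d_\ell^{j_\ell}\, \Delta_{d_{\ell+1}}(x^m) \\
&= \sum_{\substack{j_1 + \cdots + j_\ell + m = k \\ m \geq 0,\ j_1, \ldots, j_\ell \geq 1}} \frac{k!}{m!\, j_1! \cdots j_\ell!}\, d_1^{j_1} \cdots d_\ell^{j_\ell} \sum_{\substack{j_{\ell+1} + j = m \\ j \geq 0,\ j_{\ell+1} \geq 1}} \frac{m!}{j!\, j_{\ell+1}!}\, d_{\ell+1}^{j_{\ell+1}} x^j \\
&= \sum_{\substack{j_1 + \cdots + j_\ell + m = k \\ m \geq 0,\ j_1, \ldots, j_\ell \geq 1}} \ \sum_{\substack{j_{\ell+1} + j = m \\ j \geq 0,\ j_{\ell+1} \geq 1}} \frac{k!}{j!\, j_1! \cdots j_\ell!\, j_{\ell+1}!}\, d_1^{j_1} \cdots d_\ell^{j_\ell} d_{\ell+1}^{j_{\ell+1}} x^j \\
&= \sum_{\substack{j_1 + \cdots + j_\ell + j_{\ell+1} + j = k \\ j \geq 0,\ j_1, \ldots, j_\ell, j_{\ell+1} \geq 1}} \frac{k!}{j!\, j_1! \cdots j_\ell!\, j_{\ell+1}!}\, d_1^{j_1} \cdots d_\ell^{j_\ell} d_{\ell+1}^{j_{\ell+1}} x^j.
\end{aligned}
\]
Since the multinomial coefficients $k!/(j!\, j_1! \cdots j_\ell!)$ are integers, it follows that if
$d_1, \ldots, d_\ell$ are integers, then the polynomial $p_{k-\ell}(x)$ has integer coefficients. This
completes the proof.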


Lemma 4.3 Let $k \geq 2$. Then
\[ \Delta_{d_{k-1}, \ldots, d_1}(x^k) = d_1 \cdots d_{k-1}\, k! \left( x + \frac{d_1 + \cdots + d_{k-1}}{2} \right). \]

Proof. This follows immediately from Lemma 4.2.
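
The identity in Lemma 4.3 is easy to check numerically. The following is a minimal Python sketch, not part of the original text; the function names `diff`, `iterated_diff`, and `lemma_4_3_rhs` are illustrative choices. It applies the forward difference operator repeatedly and compares the result with the closed form of Lemma 4.3 in exact integer arithmetic.

```python
from math import factorial, prod


def diff(f, d):
    """Forward difference operator: returns the function x -> f(x + d) - f(x)."""
    return lambda x: f(x + d) - f(x)


def iterated_diff(f, ds):
    """Apply Delta_{d_l, ..., d_1} to f, where ds = [d_1, ..., d_l] (d_1 is applied first)."""
    for d in ds:
        f = diff(f, d)
    return f


def lemma_4_3_rhs(k, ds, x):
    """Closed form d_1...d_{k-1} * k! * (x + (d_1 + ... + d_{k-1}) / 2) for integer inputs."""
    # k! is even for k >= 2, so the product below is even and the division is exact.
    return prod(ds) * factorial(k) * (2 * x + sum(ds)) // 2


# Example: k = 4 with three shifts d_1, d_2, d_3, evaluated at x = 7.
print(iterated_diff(lambda x: x ** 4, [2, -3, 5])(7), lemma_4_3_rhs(4, [2, -3, 5], 7))
```

Because the operators $\Delta_d$ commute (Exercise 5), the order of the entries of `ds` does not affect the result.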


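Lemma 4.4 Let $\ell \geq 1$ and $\Delta_{d_\ell, d_{\ell-1}, \ldots, d_1}$ be an iterated difference operator. Let
$f(x) = \alpha x^k + \cdots$ be a polynomial of degree $k$. Then
\[ \Delta_{d_\ell, \ldots, d_1}(f)(x) = d_1 \cdots d_\ell \left( k(k-1) \cdots (k - \ell + 1)\, \alpha x^{k-\ell} + \cdots \right) \]
if $1 \leq \ell \leq k$ and
\[ \Delta_{d_\ell, d_{\ell-1}, \ldots, d_1}(f)(x) = 0 \]
if $\ell > k$. In particular, if $\ell = k - 1$ and $d_1 \cdots d_{k-1} \neq 0$, then
\[ \Delta_{d_{k-1}, \ldots, d_1}(f)(x) = d_1 \cdots d_{k-1}\, k!\, \alpha x + \beta \]
is a polynomial of degree one.


Proof. Let $f(x) = \sum_{j=0}^{k} \alpha_j x^j$, where $\alpha_k = \alpha$. Since the difference operator $\Delta$
is linear, it follows from Lemma 4.2 that
\[
\Delta_{d_\ell, \ldots, d_1}(f)(x) = \sum_{j=0}^{k} \alpha_j\, \Delta_{d_\ell, \ldots, d_1}(x^j)
= d_1 \cdots d_\ell \left( \frac{k!}{(k - \ell)!}\, \alpha x^{k-\ell} + \cdots \right).
\]
This completes the proof.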


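Lemma 4.5 Let $1 \leq \ell \leq k$. If
\[ -P \leq d_1, \ldots, d_\ell, x \leq P, \]
then
\[ \Delta_{d_\ell, \ldots, d_1}(x^k) \ll P^k, \]
where the implied constant depends only on $k$.

Proof. It follows from Lemma 4.2 that
\[
\begin{aligned}
\left| \Delta_{d_\ell, \ldots, d_1}(x^k) \right|
&\leq \sum_{\substack{j_1 + \cdots + j_\ell + j = k \\ j \geq 0,\ j_1, \ldots, j_\ell \geq 1}} \frac{k!}{j!\, j_1! \cdots j_\ell!}\, P^k \\
&\leq \sum_{\substack{j_1 + \cdots + j_\ell + j = k \\ j, j_1, \ldots, j_\ell \geq 0}} \frac{k!}{j!\, j_1! \cdots j_\ell!}\, P^k \\
&= (\ell + 1)^k P^k \\
&\leq (k + 1)^k P^k \\
&\ll P^k.
\end{aligned}
\]
This completes the proof.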


4.3 Easier Waring’s problem 


Here is a simple application of difference operators. 

Waring’s problem states that every nonnegative integer can be written as the
sum of a bounded number of nonnegative $k$th powers. We can ask the following
similar question: Is it true that every integer can be written as the sum or difference
of a bounded number of $k$th powers? If the answer is “yes,” then for every $k$ there
exists a smallest integer $v(k)$ such that the equation
\[ n = \pm x_1^k \pm x_2^k \pm \cdots \pm x_{v(k)}^k \qquad (4.2) \]
has a solution in integers for every integer $n$. This is called the easier Waring’s
problem, and it is, indeed, much easier to prove the existence of $v(k)$ than to prove
the existence of $g(k)$. It is still an unsolved problem, however, to determine the
exact value of $v(k)$ for any $k \geq 3$.


Theorem 4.2 (Easier Waring’s problem) Let $k \geq 2$. Then $v(k)$ exists, and
\[ v(k) \leq 2^{k-1} + \frac{k!}{2}. \]

Proof. Applying the $(k-1)$-st forward difference operator to the polynomial
$f(x) = x^k$, we obtain from Lemma 4.1 and Lemma 4.3 that
\[ \Delta^{(k-1)}(x^k) = k!\, x + m = \sum_{\ell=0}^{k-1} (-1)^{k-1-\ell} \binom{k-1}{\ell} (x + \ell)^k, \]
where $m = (k-1)! \binom{k}{2}$. In this way, every integer of the form $k!\, x + m$ can be written
as the sum or difference of at most
\[ \sum_{\ell=0}^{k-1} \binom{k-1}{\ell} = 2^{k-1} \]
$k$th powers of integers. For any integer $n$, we can choose integers $q$ and $r$ such that
\[ n - m = k!\, q + r, \]
where
\[ -\frac{k!}{2} < r \leq \frac{k!}{2}. \]
Since $r$ is the sum or difference of exactly $|r|$ $k$th powers $1^k$, it follows that $n$ can be
written as the sum of at most $2^{k-1} + k!/2$ integers of the form $\pm x^k$. This completes
the proof.
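
The proof of Theorem 4.2 is constructive, and the construction can be sketched in a few lines of Python. The code below is an illustration rather than part of the original text; the function name `easier_waring` and the output format (a list of `(sign, base)` pairs) are assumptions made for the example.

```python
from math import comb, factorial


def easier_waring(n, k):
    """Return (sign, base) pairs with sum(sign * base**k) == n, following the proof of Theorem 4.2."""
    kf = factorial(k)
    m = factorial(k - 1) * comb(k, 2)
    # Choose q, r with n - m = k! * q + r and -k!/2 < r <= k!/2.
    q, r = divmod(n - m, kf)
    if r > kf // 2:
        q, r = q + 1, r - kf
    terms = []
    # k! * q + m = sum over l of (-1)^(k-1-l) * C(k-1, l) * (q + l)^k, which uses 2^(k-1) powers.
    for l in range(k):
        sign = 1 if (k - 1 - l) % 2 == 0 else -1
        terms += [(sign, q + l)] * comb(k - 1, l)
    # The remainder r is written as |r| copies of +1^k or -1^k.
    terms += [(1 if r > 0 else -1, 1)] * abs(r)
    return terms


terms = easier_waring(23, 3)
print(len(terms), sum(s * b ** 3 for s, b in terms))
```

The bases `q + l` may be zero or negative, which is allowed because equation (4.2) asks for solutions in integers.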


4.4 Fractional parts 
Let $[\alpha]$ denote the integer part of the real number $\alpha$ and let $\{\alpha\}$ denote the fractional
part of $\alpha$. Then $[\alpha] \in \mathbf{Z}$, $\{\alpha\} \in [0, 1)$, and
\[ \alpha = [\alpha] + \{\alpha\}. \]
The distance from the real number $\alpha$ to the nearest integer is denoted
\[ \|\alpha\| = \min\left( |n - \alpha| : n \in \mathbf{Z} \right) = \min\left( \{\alpha\}, 1 - \{\alpha\} \right). \]
Then $\|\alpha\| \in [0, 1/2]$, and
\[ \alpha = n \pm \|\alpha\| \]
for some integer $n$. It follows that
\[ |\sin \pi\alpha| = \sin \pi\|\alpha\| \]
for all real numbers $\alpha$. The triangle inequality
\[ \|\alpha + \beta\| \leq \|\alpha\| + \|\beta\| \qquad (4.3) \]
holds for all real numbers $\alpha$ and $\beta$ (see Exercise 2).

The following two very simple lemmas are at the core of Weyl’s inequality for
exponential sums, and Weyl’s inequality, in turn, is at the core of our application
of the circle method to Waring’s problem. Recall that $\exp(t) = e^t$ and $e(t) = \exp(2\pi i t) = e^{2\pi i t}$.


Lemma 4.6 If $0 < \alpha < 1/2$, then
\[ 2\alpha < \sin \pi\alpha < \pi\alpha. \]

Proof. Let $s(\alpha) = \sin \pi\alpha - 2\alpha$. Then $s(0) = s(1/2) = 0$. If $s(\alpha) = 0$ for some
$\alpha \in (0, 1/2)$, then $s'(\alpha) = \pi \cos \pi\alpha - 2$ would have at least two zeros in $(0, 1/2)$,
which is impossible because $s'(\alpha)$ decreases monotonically from $\pi - 2$ to $-2$ in
this interval. Since $s(1/4) = (\sqrt{2} - 1)/2 > 0$, it follows that $s(\alpha) > 0$ for all
$\alpha \in (0, 1/2)$. This gives the lower bound. The proof of the upper bound is similar.

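Lemma 4.7 For every real number $\alpha$ and all integers $N_1 < N_2$,
\[ \sum_{n=N_1+1}^{N_2} e(\alpha n) \ll \min\left( N_2 - N_1, \|\alpha\|^{-1} \right). \]

Proof. Since $|e(\alpha n)| = 1$ for all integers $n$, we have
\[ \left| \sum_{n=N_1+1}^{N_2} e(\alpha n) \right| \leq \sum_{n=N_1+1}^{N_2} 1 = N_2 - N_1. \]
If $\alpha \notin \mathbf{Z}$, then $\|\alpha\| > 0$ and $e(\alpha) \neq 1$. Since the sum is also a geometric progression,
we have
\[
\begin{aligned}
\left| \sum_{n=N_1+1}^{N_2} e(\alpha n) \right|
&= \left| e(\alpha(N_1 + 1)) \sum_{n=0}^{N_2 - N_1 - 1} e(\alpha)^n \right| \\
&= \left| \frac{e(\alpha(N_2 - N_1)) - 1}{e(\alpha) - 1} \right| \\
&\leq \frac{2}{|e(\alpha) - 1|} \\
&= \frac{2}{|e(\alpha/2) - e(-\alpha/2)|} \\
&= \frac{2}{|2i \sin \pi\alpha|} \\
&= \frac{1}{|\sin \pi\alpha|} \\
&= \frac{1}{\sin(\pi\|\alpha\|)} \\
&\leq \frac{1}{2\|\alpha\|}.
\end{aligned}
\]
This completes the proof.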
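
As a numerical illustration of Lemma 4.7, the Python sketch below compares the modulus of the exponential sum with the explicit bound $\min(N_2 - N_1, 1/(2\|\alpha\|))$ obtained in the proof. It is an illustrative check under floating-point arithmetic, and the helper names `e`, `dist_to_nearest_int`, `exp_sum`, and `lemma_4_7_bound` are assumptions made for this example.

```python
import cmath


def e(t):
    """e(t) = exp(2 pi i t)."""
    return cmath.exp(2j * cmath.pi * t)


def dist_to_nearest_int(a):
    """Distance ||a|| from a to the nearest integer (floating point)."""
    return abs(a - round(a))


def exp_sum(alpha, n1, n2):
    """Sum of e(alpha * n) for n = n1 + 1, ..., n2."""
    return sum(e(alpha * n) for n in range(n1 + 1, n2 + 1))


def lemma_4_7_bound(alpha, n1, n2):
    """Explicit bound min(N_2 - N_1, 1 / (2 ||alpha||)) from the proof of Lemma 4.7."""
    d = dist_to_nearest_int(alpha)
    return (n2 - n1) if d == 0 else min(n2 - n1, 1 / (2 * d))


for alpha in (0.3, 2 ** 0.5, 0.001):
    print(alpha, abs(exp_sum(alpha, -5, 40)), lemma_4_7_bound(alpha, -5, 40))
```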


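Lemma 4.8 Let $\alpha$ be a real number, and let $q$ and $a$ be integers such that $q \geq 1$
and $(a, q) = 1$. If
\[ \left| \alpha - \frac{a}{q} \right| \leq \frac{1}{q^2}, \]
then
\[ \sum_{1 \leq r \leq q/2} \frac{1}{\|\alpha r\|} \ll q \log q. \]

Proof. The lemma holds for $q = 1$, since the sum is then empty:
\[ \sum_{1 \leq r \leq q/2} \frac{1}{\|\alpha r\|} = 0. \]
Therefore, we can assume that $q \geq 2$. For each integer $r$, there exist integers
$s(r) \in [0, q/2]$ and $m(r)$ such that
\[ \frac{s(r)}{q} = \left\| \frac{ar}{q} \right\| = \pm\left( \frac{ar}{q} - m(r) \right). \]
Since $(a, q) = 1$, it follows that $s(r) = 0$ if and only if $r \equiv 0 \pmod{q}$, and so
$s(r) \in [1, q/2]$ if $r \in [1, q/2]$. Let
\[ \alpha - \frac{a}{q} = \frac{\theta}{q^2}, \]
where $-1 \leq \theta \leq 1$. Then
\[ \alpha r = \frac{ar}{q} + \frac{\theta r}{q^2} = \frac{ar}{q} + \frac{\theta'}{2q}, \]
where
\[ |\theta'| = \left| \frac{2\theta r}{q} \right| \leq 1 \qquad \text{for } 1 \leq r \leq q/2. \]
It follows from (4.3) that
\[
\begin{aligned}
\|\alpha r\| &= \left\| \frac{ar}{q} + \frac{\theta'}{2q} \right\| \\
&= \left\| m(r) \pm \frac{s(r)}{q} + \frac{\theta'}{2q} \right\| \\
&= \left\| \frac{s(r)}{q} \pm \frac{\theta'}{2q} \right\| \\
&\geq \left\| \frac{s(r)}{q} \right\| - \left\| \frac{\theta'}{2q} \right\| \\
&\geq \frac{s(r)}{q} - \frac{1}{2q} \\
&\geq \frac{1}{2q}.
\end{aligned}
\]
Let $1 \leq r_1 \leq r_2 \leq q/2$. We shall show that $s(r_1) = s(r_2)$ if and only if $r_1 = r_2$. If
\[ \left\| \frac{ar_1}{q} \right\| = \left\| \frac{ar_2}{q} \right\|, \]
then
\[ \pm\left( \frac{ar_1}{q} - m(r_1) \right) = \pm\left( \frac{ar_2}{q} - m(r_2) \right) \]
and so
\[ ar_1 \equiv \pm ar_2 \pmod{q}. \]
Since $(a, q) = 1$ and $1 \leq r_1 \leq r_2 \leq q/2$, we have
\[ r_1 \equiv \pm r_2 \pmod{q} \]
and so
\[ r_1 = r_2. \]
It follows that
\[ \left\{ \left\| \frac{ar}{q} \right\| : 1 \leq r \leq \frac{q}{2} \right\} = \left\{ \frac{s(r)}{q} : 1 \leq r \leq \frac{q}{2} \right\} = \left\{ \frac{s}{q} : 1 \leq s \leq \frac{q}{2} \right\}. \]
Therefore,
\[
\begin{aligned}
\sum_{1 \leq r \leq q/2} \frac{1}{\|\alpha r\|}
&\leq \sum_{1 \leq r \leq q/2} \frac{1}{\frac{s(r)}{q} - \frac{1}{2q}} \\
&= 2q \sum_{1 \leq s \leq q/2} \frac{1}{2s - 1} \\
&\leq 2q \sum_{1 \leq s \leq q/2} \frac{1}{s} \\
&\ll q \log q.
\end{aligned}
\]
This completes the proof.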
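
The explicit inequality reached in the proof of Lemma 4.8, namely $\sum_{1 \leq r \leq q/2} \|\alpha r\|^{-1} \leq 2q \sum_{1 \leq s \leq q/2} 1/(2s - 1)$ for $q \geq 2$, can be tested in exact rational arithmetic. The Python sketch below is illustrative only; the names `dist`, `lemma_4_8_sum`, and `proof_bound` are assumptions, and $\alpha$ is taken of the form $a/q + \theta/q^2$ with rational $\theta \in [-1, 1]$.

```python
from fractions import Fraction


def dist(x):
    """Distance ||x|| from the rational number x to the nearest integer."""
    frac = x - (x.numerator // x.denominator)  # fractional part {x}
    return min(frac, 1 - frac)


def lemma_4_8_sum(alpha, q):
    """Sum of 1 / ||alpha * r|| over the integers 1 <= r <= q/2."""
    return sum(1 / dist(alpha * r) for r in range(1, q // 2 + 1))


def proof_bound(q):
    """The bound 2q * sum_{1 <= s <= q/2} 1 / (2s - 1) obtained in the proof."""
    return sum(Fraction(2 * q, 2 * s - 1) for s in range(1, q // 2 + 1))


q, a, theta = 17, 5, Fraction(3, 4)
alpha = Fraction(a, q) + theta / q ** 2
print(float(lemma_4_8_sum(alpha, q)), float(proof_bound(q)))
```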


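Lemma 4.9 Let $\alpha$ be a real number. If
\[ \left| \alpha - \frac{a}{q} \right| \leq \frac{1}{q^2}, \]
where $q \geq 1$ and $(a, q) = 1$, then for any nonnegative real number $V$ and
nonnegative integer $h$, we have
\[ \sum_{r=1}^{q} \min\left( V, \frac{1}{\|\alpha(hq + r)\|} \right) \ll V + q \log q. \]

Proof. Let
\[ \alpha = \frac{a}{q} + \frac{\theta}{q^2}, \]
where
\[ -1 \leq \theta \leq 1. \]
Then
\[
\begin{aligned}
\alpha(hq + r) &= ah + \frac{ar}{q} + \frac{\theta h}{q} + \frac{\theta r}{q^2} \\
&= ah + \frac{ar}{q} + \frac{[\theta h] + \{\theta h\}}{q} + \frac{\theta r}{q^2} \\
&= ah + \frac{ar + [\theta h] + \delta(r)}{q},
\end{aligned}
\]
where
\[ -1 \leq \delta(r) = \{\theta h\} + \frac{\theta r}{q} < 2. \]
For each $r = 1, \ldots, q$ there is a unique integer $r'$ such that
\[ \{\alpha(hq + r)\} = \frac{ar + [\theta h] + \delta(r)}{q} - r'. \]
Let
\[ 0 \leq t \leq 1 - \frac{1}{q}. \]
If
\[ t \leq \{\alpha(hq + r)\} \leq t + \frac{1}{q}, \]
then
\[ qt \leq ar - qr' + [\theta h] + \delta(r) \leq qt + 1. \]
This implies that
\[ ar - qr' \leq qt - [\theta h] + 1 - \delta(r) \leq qt - [\theta h] + 2 \]
and
\[ ar - qr' \geq qt - [\theta h] - \delta(r) > qt - [\theta h] - 2. \]
Thus, $ar - qr'$ lies in the half-open interval $J$ of length 4, where
\[ J = \left( qt - [\theta h] - 2,\ qt - [\theta h] + 2 \right]. \]
This interval contains exactly four distinct integers. If $1 \leq r_1 \leq r_2 \leq q$ and
\[ ar_1 - qr_1' = ar_2 - qr_2', \]
then
\[ ar_1 \equiv ar_2 \pmod{q}. \]
Since $(a, q) = 1$, we have
\[ r_1 \equiv r_2 \pmod{q} \]
and so
\[ r_1 = r_2. \]
It follows that for any $t \in [0, (q-1)/q]$, there are at most four integers $r \in [1, q]$
such that
\[ \{\alpha(hq + r)\} \in [t, t + (1/q)]. \]
We observe that
\[ \|\alpha(hq + r)\| \in [t, t + (1/q)] \]
if and only if either
\[ \{\alpha(hq + r)\} \in [t, t + (1/q)] \]
or
\[ 1 - \{\alpha(hq + r)\} \in [t, t + (1/q)]. \]
The latter inclusion is equivalent to
\[ \{\alpha(hq + r)\} \in [t', t' + (1/q)], \]
where
\[ 0 \leq t' = 1 - \frac{1}{q} - t \leq 1 - \frac{1}{q}. \]
It follows that for any $t \in [0, (q-1)/q]$, there are at most eight integers $r \in [1, q]$
for which
\[ \|\alpha(hq + r)\| \in [t, t + (1/q)]. \]
In particular, if we let $J(s) = [s/q, (s+1)/q]$ for $s = 0, 1, \ldots$, then
\[ \|\alpha(hq + r)\| \in J(s) \]
for at most eight $r \in [1, q]$.
We apply this fact to estimate the sum
\[ \sum_{1 \leq r \leq q} \min\left( V, \frac{1}{\|\alpha(hq + r)\|} \right). \]
If $\|\alpha(hq + r)\| \in J(0) = [0, 1/q]$, then we use the inequality
\[ \min\left( V, \frac{1}{\|\alpha(hq + r)\|} \right) \leq V. \]
If $\|\alpha(hq + r)\| \in J(s)$ for some $s \geq 1$, then we use the inequality
\[ \min\left( V, \frac{1}{\|\alpha(hq + r)\|} \right) \leq \frac{1}{\|\alpha(hq + r)\|} \leq \frac{q}{s}. \]
Since $\|\alpha(hq + r)\| \in J(s)$ for some $s < q/2$, it follows that
\[
\sum_{1 \leq r \leq q} \min\left( V, \frac{1}{\|\alpha(hq + r)\|} \right)
\leq 8V + 8 \sum_{1 \leq s < q/2} \frac{q}{s}
\ll V + q \log q.
\]
This completes the proof.
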
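Lemma 4.10 Let $\alpha$ be a real number. If
\[ \left| \alpha - \frac{a}{q} \right| \leq \frac{1}{q^2}, \]
where $q \geq 1$ and $(a, q) = 1$, then for any real number $U \geq 1$ and positive integer
$n$ we have
\[ \sum_{1 \leq k \leq U} \min\left( \frac{n}{k}, \frac{1}{\|\alpha k\|} \right) \ll \left( \frac{n}{q} + U + q \right) \log 2qU. \]

Proof. We can write $k$ in the form
\[ k = hq + r, \]
where
\[ 1 \leq r \leq q \]
and
\[ 0 \leq h < \frac{U}{q}. \]
Then
\[
\begin{aligned}
S &= \sum_{1 \leq k \leq U} \min\left( \frac{n}{k}, \frac{1}{\|\alpha k\|} \right) \\
&\leq \sum_{0 \leq h < U/q} \ \sum_{1 \leq r \leq q} \min\left( \frac{n}{hq + r}, \frac{1}{\|\alpha(hq + r)\|} \right).
\end{aligned}
\]
If $h = 0$ and $1 \leq r \leq q/2$, then Lemma 4.8 gives
\[ \sum_{1 \leq r \leq q/2} \min\left( \frac{n}{r}, \frac{1}{\|\alpha r\|} \right) \leq \sum_{1 \leq r \leq q/2} \frac{1}{\|\alpha r\|} \ll q \log q. \]
For the remaining terms, we have
\[ \frac{1}{hq + r} < \frac{2}{(h + 1)q}, \]
since either $h \geq 1$ and
\[ hq + r > hq \geq \frac{(h + 1)q}{2} \]
or $h = 0$, $q/2 < r \leq q$, and
\[ hq + r = r > \frac{q}{2} = \frac{(h + 1)q}{2}. \]
Therefore,
\[ S \ll q \log q + \sum_{0 \leq h < U/q} \ \sum_{r=1}^{q} \min\left( \frac{n}{(h + 1)q}, \frac{1}{\|\alpha(hq + r)\|} \right). \]
Note that
\[ \frac{U}{q} + 1 \leq U + q \leq 2\max(q, U) \leq 2qU. \]
Estimating the inner sum by Lemma 4.9 with $V = n/((h + 1)q)$, we obtain
\[
\begin{aligned}
S &\ll q \log q + \sum_{0 \leq h < U/q} \ \sum_{r=1}^{q} \min\left( \frac{n}{(h + 1)q}, \frac{1}{\|\alpha(hq + r)\|} \right) \qquad (4.4) \\
&\ll q \log q + \sum_{0 \leq h < U/q} \left( \frac{n}{(h + 1)q} + q \log q \right) \\
&\ll q \log q + \frac{n}{q} \sum_{0 \leq h < U/q} \frac{1}{h + 1} + \left( \frac{U}{q} + 1 \right) q \log q \\
&\ll q \log q + \frac{n}{q} \log\left( \frac{U}{q} + 1 \right) + U \log q + q \log q \\
&\ll \left( \frac{n}{q} + U + q \right) \log 2qU.
\end{aligned}
\]
This completes the proof.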


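Lemma 4.11 Let $\alpha$ be a real number. If
\[ \left| \alpha - \frac{a}{q} \right| \leq \frac{1}{q^2}, \]
where $q \geq 1$ and $(a, q) = 1$, then for any real numbers $U$ and $n$ we have
\[ \sum_{1 \leq k \leq U} \min\left( n, \frac{1}{\|\alpha k\|} \right) \ll \left( q + U + n + \frac{Un}{q} \right) \max\{1, \log q\}. \]

Proof. This is almost exactly the same as the proof of Lemma 4.10. We have
\[
\begin{aligned}
S &= \sum_{1 \leq k \leq U} \min\left( n, \frac{1}{\|\alpha k\|} \right) \\
&\leq \sum_{0 \leq h < U/q} \ \sum_{1 \leq r \leq q} \min\left( n, \frac{1}{\|\alpha(hq + r)\|} \right) \\
&\ll q \log q + \sum_{0 \leq h < U/q} \left( n + \sum_{1 \leq s < q/2} \frac{q}{s} \right) \\
&\ll q \log q + \sum_{0 \leq h < U/q} (n + q \log q) \\
&\ll q \log q + \left( \frac{U}{q} + 1 \right) (n + q \log q) \\
&\ll q \log q + U \log q + n + \frac{Un}{q} \\
&\ll \left( q + U + n + \frac{Un}{q} \right) \max\{1, \log q\}.
\end{aligned}
\]
This completes the proof.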
4.5 Weyl’s inequality and Hua’s lemma

In this section, we denote by $[M, N]$ the interval of integers $m$ such that $M \leq m \leq N$.
For any real number $t$, the complex conjugate of $e(t) = e^{2\pi i t}$ is $\overline{e(t)} = e(-t)$.


Lemma 4.12 Let $N_1$, $N_2$, and $N$ be integers such that $N_1 < N_2$ and $0 \leq N_2 - N_1 \leq N$.
Let $f(n)$ be a real-valued arithmetic function, and let
\[ S(f) = \sum_{n=N_1+1}^{N_2} e(f(n)). \]
Then
\[ |S(f)|^2 = \sum_{|d| < N} S_d(f), \]
where
\[ S_d(f) = \sum_{n \in I(d)} e(\Delta_d(f)(n)) \]
and $I(d)$ is an interval of consecutive integers contained in $[N_1 + 1, N_2]$.

Proof. For any integer $d$, let
\[ I(d) = [N_1 + 1 - d, N_2 - d] \cap [N_1 + 1, N_2]. \]
Squaring the absolute value of the exponential sum, we get
\[
\begin{aligned}
|S(f)|^2 &= S(f)\overline{S(f)} \\
&= \sum_{m=N_1+1}^{N_2} e(f(m)) \sum_{n=N_1+1}^{N_2} \overline{e(f(n))} \\
&= \sum_{n=N_1+1}^{N_2} \sum_{m=N_1+1}^{N_2} e(f(m) - f(n)) \\
&= \sum_{n=N_1+1}^{N_2} \sum_{d=N_1+1-n}^{N_2-n} e(f(n + d) - f(n)) \\
&= \sum_{n=N_1+1}^{N_2} \sum_{d=N_1+1-n}^{N_2-n} e(\Delta_d(f)(n)) \\
&= \sum_{d=-(N_2-N_1-1)}^{N_2-N_1-1} \ \sum_{n \in I(d)} e(\Delta_d(f)(n)) \\
&= \sum_{|d| < N} \ \sum_{n \in I(d)} e(\Delta_d(f)(n)) \\
&= \sum_{|d| < N} S_d(f).
\end{aligned}
\]
This completes the proof.
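
The identity of Lemma 4.12 (the basic step of Weyl differencing) can be confirmed numerically for a sample function. The Python sketch below is an illustration, not part of the original text; it takes $N = N_2 - N_1$, and the function names `e`, `S`, and `differenced_sum` are assumptions.

```python
import cmath


def e(t):
    """e(t) = exp(2 pi i t)."""
    return cmath.exp(2j * cmath.pi * t)


def S(f, n1, n2):
    """Exponential sum of e(f(n)) over n = n1 + 1, ..., n2."""
    return sum(e(f(n)) for n in range(n1 + 1, n2 + 1))


def differenced_sum(f, n1, n2):
    """Sum over |d| < n2 - n1 of S_d(f), where I(d) = [n1+1-d, n2-d] intersected with [n1+1, n2]."""
    total = 0
    for d in range(-(n2 - n1 - 1), n2 - n1):
        lo = max(n1 + 1 - d, n1 + 1)
        hi = min(n2 - d, n2)
        # e(Delta_d(f)(n)) = e(f(n + d) - f(n))
        total += sum(e(f(n + d) - f(n)) for n in range(lo, hi + 1))
    return total


f = lambda n: 0.37 * n ** 3 + 0.11 * n  # sample real-valued arithmetic function
print(abs(S(f, 2, 12)) ** 2, differenced_sum(f, 2, 12))
```

The second printed value is complex only because of floating-point rounding; its imaginary part should be negligible.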
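Lemma 4.13 Let $N_1$, $N_2$, $N$, and $\ell$ be integers such that $\ell \geq 1$, $N_1 < N_2$, and
$0 \leq N_2 - N_1 \leq N$. Let $f(n)$ be a real-valued arithmetic function, and let
\[ S(f) = \sum_{n=N_1+1}^{N_2} e(f(n)). \]
Then
\[ |S(f)|^{2^\ell} \leq (2N)^{2^\ell - \ell - 1} \sum_{|d_1| < N} \cdots \sum_{|d_\ell| < N} S_{d_\ell, \ldots, d_1}(f), \]
where
\[ S_{d_\ell, \ldots, d_1}(f) = \sum_{n \in I(d_\ell, \ldots, d_1)} e\left( \Delta_{d_\ell, \ldots, d_1}(f)(n) \right) \qquad (4.5) \]
and $I(d_\ell, \ldots, d_1)$ is an interval of consecutive integers contained in $[N_1 + 1, N_2]$.

Proof. This is by induction on $\ell$. The case $\ell = 1$ is Lemma 4.12. Now assume
that the result is true for $\ell \geq 1$. Using the Cauchy-Schwarz inequality, we obtain
\[
\begin{aligned}
|S(f)|^{2^{\ell+1}} &= \left( |S(f)|^{2^\ell} \right)^2 \\
&\leq \left( (2N)^{2^\ell - \ell - 1} \sum_{|d_1| < N} \cdots \sum_{|d_\ell| < N} \left| S_{d_\ell, \ldots, d_1}(f) \right| \right)^2 \\
&= (2N)^{2^{\ell+1} - 2\ell - 2} \left( \sum_{|d_1| < N} \cdots \sum_{|d_\ell| < N} \left| S_{d_\ell, \ldots, d_1}(f) \right| \right)^2 \\
&\leq (2N)^{2^{\ell+1} - 2\ell - 2} (2N)^{\ell} \sum_{|d_1| < N} \cdots \sum_{|d_\ell| < N} \left| S_{d_\ell, \ldots, d_1}(f) \right|^2,
\end{aligned}
\]
where $S_{d_\ell, \ldots, d_1}(f)$ is an exponential sum of the form (4.5). By Lemma 4.12, for
each $d_1, \ldots, d_\ell$, there is an interval
\[ I(d_{\ell+1}, d_\ell, \ldots, d_1) \subseteq I(d_\ell, \ldots, d_1) \subseteq [N_1 + 1, N_2] \]
such that
\[
\begin{aligned}
\left| S_{d_\ell, \ldots, d_1}(f) \right|^2
&= \left| \sum_{n \in I(d_\ell, \ldots, d_1)} e\left( \Delta_{d_\ell, \ldots, d_1}(f)(n) \right) \right|^2 \\
&= \sum_{|d_{\ell+1}| < N} \ \sum_{n \in I(d_{\ell+1}, d_\ell, \ldots, d_1)} e\left( \Delta_{d_{\ell+1}, d_\ell, \ldots, d_1}(f)(n) \right) \\
&= \sum_{|d_{\ell+1}| < N} S_{d_{\ell+1}, d_\ell, \ldots, d_1}(f),
\end{aligned}
\]
and so
\[ |S(f)|^{2^{\ell+1}} \leq (2N)^{2^{\ell+1} - (\ell+1) - 1} \sum_{|d_1| < N} \cdots \sum_{|d_\ell| < N} \sum_{|d_{\ell+1}| < N} S_{d_{\ell+1}, d_\ell, \ldots, d_1}(f). \]
This completes the proof.
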
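Lemma 4.14 Let $k \geq 1$, $K = 2^{k-1}$, and $\varepsilon > 0$. Let $f(x) = \alpha x^k + \cdots$ be a
polynomial of degree $k$ with real coefficients. If
\[ S(f) = \sum_{n=1}^{N} e(f(n)), \]
then
\[ |S(f)|^K \ll N^{K-1} + N^{K-k+\varepsilon} \sum_{m=1}^{k! N^{k-1}} \min\left( N, \|m\alpha\|^{-1} \right), \]
where the implied constant depends on $k$ and $\varepsilon$.

Proof. Applying Lemma 4.13 with $\ell = k - 1$, we obtain
\[ |S(f)|^K \leq (2N)^{K-k} \sum_{|d_1| < N} \cdots \sum_{|d_{k-1}| < N} \left| S_{d_{k-1}, \ldots, d_1}(f) \right|, \]
where
\[ S_{d_{k-1}, \ldots, d_1}(f) = \sum_{n \in I(d_{k-1}, \ldots, d_1)} e\left( \Delta_{d_{k-1}, \ldots, d_1}(f)(n) \right) \]
and $I(d_{k-1}, \ldots, d_1)$ is an interval of integers contained in $[1, N]$. Since $|e(t)| = 1$
for all real $t$, we have the upper bound
\[ \left| S_{d_{k-1}, \ldots, d_1}(f) \right| \leq \sum_{n \in I(d_{k-1}, \ldots, d_1)} \left| e\left( \Delta_{d_{k-1}, \ldots, d_1}(f)(n) \right) \right| \leq N. \]
By Lemma 4.4, for any nonzero integers $d_1, \ldots, d_{k-1}$, the difference operator
$\Delta_{d_{k-1}, \ldots, d_1}$ applied to the polynomial $f(x)$ of degree $k$ produces the linear polynomial
\[ \Delta_{d_{k-1}, \ldots, d_1}(f)(x) = d_{k-1} \cdots d_1\, k!\, \alpha x + \beta = \lambda x + \beta, \]
where
\[ \lambda = d_{k-1} \cdots d_1\, k!\, \alpha \]
and $\beta \in \mathbf{R}$. Let $I(d_{k-1}, \ldots, d_1) = [N_1 + 1, N_2]$. By Lemma 4.7,
\[
\begin{aligned}
\left| S_{d_{k-1}, \ldots, d_1}(f) \right|
&= \left| \sum_{n \in I(d_{k-1}, \ldots, d_1)} e\left( \Delta_{d_{k-1}, \ldots, d_1}(f)(n) \right) \right| \\
&= \left| \sum_{n=N_1+1}^{N_2} e(\lambda n + \beta) \right| \\
&= \left| \sum_{n=N_1+1}^{N_2} e(\lambda n) \right| \\
&\ll \|\lambda\|^{-1} \\
&= \|d_{k-1} \cdots d_1\, k!\, \alpha\|^{-1}.
\end{aligned}
\]
It follows that
\[ \left| S_{d_{k-1}, \ldots, d_1}(f) \right| \ll \min\left( N, \|d_1 \cdots d_{k-1}\, k!\, \alpha\|^{-1} \right). \]
Therefore,
\[
\begin{aligned}
|S(f)|^K &\leq (2N)^{K-k} \sum_{|d_1| < N} \cdots \sum_{|d_{k-1}| < N} \left| S_{d_{k-1}, \ldots, d_1}(f) \right| \\
&\ll (2N)^{K-k} \sum_{|d_1| < N} \cdots \sum_{|d_{k-1}| < N} \min\left( N, \|d_1 \cdots d_{k-1}\, k!\, \alpha\|^{-1} \right).
\end{aligned}
\]
Since there are fewer than $(k-1)(2N)^{k-2}$ choices of $d_1, \ldots, d_{k-1}$ such that
$d_1 \cdots d_{k-1} = 0$, and each such choice contributes $N$ to the sum, it follows that
\[
\begin{aligned}
|S(f)|^K &\ll (2N)^{K-k} (k-1) (2N)^{k-2} N \\
&\quad + (2N)^{K-k} \sum_{1 \leq |d_1| < N} \cdots \sum_{1 \leq |d_{k-1}| < N} \min\left( N, \|d_1 \cdots d_{k-1}\, k!\, \alpha\|^{-1} \right) \\
&\leq k(2N)^{K-1} \\
&\quad + 2^{k-1} (2N)^{K-k} \sum_{1 \leq d_1 < N} \cdots \sum_{1 \leq d_{k-1} < N} \min\left( N, \|d_1 \cdots d_{k-1}\, k!\, \alpha\|^{-1} \right) \\
&\ll N^{K-1} + N^{K-k} \sum_{d_1=1}^{N} \cdots \sum_{d_{k-1}=1}^{N} \min\left( N, \|d_1 \cdots d_{k-1}\, k!\, \alpha\|^{-1} \right),
\end{aligned}
\]
where the implied constant depends only on $k$. Since
\[ 1 \leq d_1 \cdots d_{k-1}\, k! \leq k!\, N^{k-1} \]
and the divisor function $\tau(m)$ satisfies $\tau(m) \ll_\varepsilon m^\varepsilon$ for every $\varepsilon > 0$, it follows
that the number of representations of an integer $m$ in the form $d_1 \cdots d_{k-1}\, k!$ is
$\ll m^\varepsilon \ll N^\varepsilon$. Therefore,
\[
\begin{aligned}
|S(f)|^K &\ll N^{K-1} + N^{K-k} \sum_{d_1=1}^{N} \cdots \sum_{d_{k-1}=1}^{N} \min\left( N, \|d_{k-1} \cdots d_1\, k!\, \alpha\|^{-1} \right) \\
&\ll N^{K-1} + N^{K-k+\varepsilon} \sum_{m=1}^{k! N^{k-1}} \min\left( N, \|m\alpha\|^{-1} \right),
\end{aligned}
\]
where the implied constant depends on $k$ and $\varepsilon$. This completes the proof.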


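Theorem 4.3 (Weyl’s inequality) Let $f(x) = \alpha x^k + \cdots$ be a polynomial of degree
$k \geq 2$ with real coefficients, and suppose that $\alpha$ has the rational approximation
$a/q$ such that
\[ \left| \alpha - \frac{a}{q} \right| \leq \frac{1}{q^2}, \]
where $q \geq 1$ and $(a, q) = 1$. Let
\[ S(f) = \sum_{n=1}^{N} e(f(n)). \]
Let $K = 2^{k-1}$ and $\varepsilon > 0$. Then
\[ S(f) \ll N^{1+\varepsilon} \left( N^{-1} + q^{-1} + N^{-k} q \right)^{1/K}, \]
where the implied constant depends on $k$ and $\varepsilon$.

Proof. Since $|S(f)| \leq N$, the result is immediate if $q \geq N^k$. Thus, we can
assume that
\[ 1 \leq q < N^k, \]
and so
\[ \log q \ll \log N \ll N^\varepsilon. \]
By Lemma 4.14, we have
\[ |S(f)|^K \ll N^{K-1} + N^{K-k+\varepsilon} \sum_{m=1}^{k! N^{k-1}} \min\left( N, \|m\alpha\|^{-1} \right). \]
By Lemma 4.11, we have
\[
\begin{aligned}
\sum_{m=1}^{k! N^{k-1}} \min\left( N, \|m\alpha\|^{-1} \right)
&\ll \left( q + k!\, N^{k-1} + N + \frac{k!\, N^k}{q} \right) \max\{1, \log q\} \\
&\ll \left( q + N^{k-1} + \frac{N^k}{q} \right) \log N \\
&\ll N^k \left( q N^{-k} + N^{-1} + q^{-1} \right) N^\varepsilon.
\end{aligned}
\]
Therefore,
\[
\begin{aligned}
|S(f)|^K &\ll N^{K-1} + N^{K+\varepsilon} \left( q N^{-k} + N^{-1} + q^{-1} \right) \\
&\ll N^{K+\varepsilon} \left( q N^{-k} + N^{-1} + q^{-1} \right).
\end{aligned}
\]
This completes the proof.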


Theorem 4.4 Let $k \geq 2$, and let $a/q$ be a rational number with $q \geq 1$ and
$(a, q) = 1$. Then
\[ S(q, a) = \sum_{x=1}^{q} e(ax^k/q) \ll q^{1 - 1/K + \varepsilon}. \]

Proof. Apply Weyl’s inequality with $f(x) = ax^k/q$ and $N = q$. We obtain
\[ S(q, a) \ll q^{1+\varepsilon} \left( q^{-1} + q^{1-k} \right)^{1/K} \ll q^{1 - 1/K + \varepsilon}. \]
This completes the proof.
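
For small moduli the complete sum $S(q, a)$ can be computed directly. The Python sketch below is illustrative (the function name `complete_sum` is an assumption); it reduces $a x^k$ modulo $q$ before exponentiating to limit rounding error. For $k = 2$ and odd $q$ the sum is a quadratic Gauss sum, whose modulus is known to be $\sqrt{q}$, which is consistent with the exponent $1 - 1/K = 1/2$ in Theorem 4.4.

```python
import cmath
from math import gcd


def complete_sum(q, a, k):
    """S(q, a) = sum over x = 1..q of e(a * x^k / q), with a * x^k reduced mod q first."""
    return sum(
        cmath.exp(2j * cmath.pi * ((a * pow(x, k, q)) % q) / q)
        for x in range(1, q + 1)
    )


# Quadratic Gauss sums: |S(q, a)| = sqrt(q) for odd q and (a, q) = 1.
print(abs(complete_sum(7, 1, 2)), 7 ** 0.5)

# Cubic sums, compared with the trivial bound q.
for q in (5, 7, 9, 13):
    print(q, max(abs(complete_sum(q, a, 3)) for a in range(1, q) if gcd(a, q) == 1))
```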
Theorem 4.5 Let $k \geq 2$. There exists $\delta > 0$ with the following property: If $N \geq 2$
and $a/q$ is a rational number such that $(a, q) = 1$ and
\[ N^{1/2} \leq q \leq N^{k - 1/2}, \]
then
\[ \sum_{n=1}^{N} e(an^k/q) \ll N^{1-\delta}. \]

Proof. Applying Weyl’s inequality with $f(x) = ax^k/q$, we obtain
\[
\begin{aligned}
S(f) &\ll N^{1+\varepsilon} \left( N^{-1} + q^{-1} + N^{-k} q \right)^{1/K} \\
&\leq N^{1+\varepsilon} \left( N^{-1} + N^{-1/2} + N^{-1/2} \right)^{1/K} \\
&\ll N^{1 - 1/(2K) + \varepsilon} \\
&\ll N^{1-\delta}
\end{aligned}
\]
for any $\delta < 1/(2K)$. This completes the proof.

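Theorem 4.6 (Hua’s lemma) For $k \geq 2$, let
\[ T(\alpha) = \sum_{n=1}^{N} e(\alpha n^k). \]
Then
\[ \int_0^1 |T(\alpha)|^{2^k}\, d\alpha \ll N^{2^k - k + \varepsilon}. \]

Proof. We shall prove by induction on $j$ that
\[ \int_0^1 |T(\alpha)|^{2^j}\, d\alpha \ll N^{2^j - j + \varepsilon} \]
for $j = 1, \ldots, k$. The case $j = 1$ is clear since
\[ \int_0^1 |T(\alpha)|^2\, d\alpha = \sum_{m=1}^{N} \sum_{n=1}^{N} \int_0^1 e(\alpha(m^k - n^k))\, d\alpha = N. \]
Let $1 \leq j \leq k - 1$, and assume that the result holds for $j$. Let $f(x) = \alpha x^k$. By
Lemma 4.2,
\[ \Delta_{d_j, \ldots, d_1}(f)(x) = \alpha\, d_j \cdots d_1\, p_{k-j}(x), \]
where $p_{k-j}(x)$ is a polynomial of degree $k - j$ with integer coefficients. Applying
Lemma 4.13 with $N_1 = 0$, $N_2 = N$, and $S(f) = T(\alpha)$, we obtain
\[
\begin{aligned}
|T(\alpha)|^{2^j} &\leq (2N)^{2^j - j - 1} \sum_{|d_1| < N} \cdots \sum_{|d_j| < N} \ \sum_{n \in I(d_j, \ldots, d_1)} e\left( \Delta_{d_j, \ldots, d_1}(f)(n) \right) \\
&= (2N)^{2^j - j - 1} \sum_{|d_1| < N} \cdots \sum_{|d_j| < N} \ \sum_{n \in I(d_j, \ldots, d_1)} e\left( \alpha\, d_j \cdots d_1\, p_{k-j}(n) \right),
\end{aligned}
\]
where $I(d_j, \ldots, d_1)$ is an interval of consecutive integers contained in $[1, N]$. It
follows that
\[ |T(\alpha)|^{2^j} \ll N^{2^j - j - 1} \sum_{d} r(d)\, e(\alpha d), \qquad (4.6) \]
where $r(d)$ is the number of factorizations of $d$ in the form
\[ d = d_j \cdots d_1\, p_{k-j}(n) \]
with $|d_i| < N$ and $n \in I(d_j, \ldots, d_1)$. Since $d \ll N^k$ by Lemma 4.5, we have
\[ r(d) \ll |d|^\varepsilon \ll N^\varepsilon \]
for $d \neq 0$. Since $p_{k-j}(x)$ is a polynomial of degree $k - j \geq 1$, there are at most
$k - j$ integers $x$ such that $p_{k-j}(x) = 0$, and so
\[ r(0) \ll N^j. \]
Similarly, since
\[
\begin{aligned}
|T(\alpha)|^{2^j} &= T(\alpha)^{2^{j-1}}\, \overline{T(\alpha)}^{2^{j-1}} \\
&= \sum_{x_1=1}^{N} \cdots \sum_{x_{2^{j-1}}=1}^{N} \ \sum_{y_1=1}^{N} \cdots \sum_{y_{2^{j-1}}=1}^{N} e\left( \alpha \left( \sum_{i=1}^{2^{j-1}} x_i^k - \sum_{i=1}^{2^{j-1}} y_i^k \right) \right) \\
&= \sum_{d} s(d)\, e(-\alpha d),
\end{aligned}
\]
where $s(d)$ is the number of representations of $d$ in the form
\[ d = \sum_{i=1}^{2^{j-1}} y_i^k - \sum_{i=1}^{2^{j-1}} x_i^k \]
with $1 \leq x_i, y_i \leq N$ for $i = 1, \ldots, 2^{j-1}$. Then
\[ \sum_{d} s(d) = |T(0)|^{2^j} = N^{2^j} \]
and, by the induction hypothesis,
\[ s(0) = \int_0^1 |T(\alpha)|^{2^j}\, d\alpha \ll N^{2^j - j + \varepsilon}. \]
It follows from (4.6) that
\[
\begin{aligned}
\int_0^1 |T(\alpha)|^{2^{j+1}}\, d\alpha &= \int_0^1 |T(\alpha)|^{2^j}\, |T(\alpha)|^{2^j}\, d\alpha \\
&\ll N^{2^j - j - 1} \int_0^1 \sum_{d'} r(d')\, e(\alpha d') \sum_{d} s(d)\, e(-\alpha d)\, d\alpha \\
&= N^{2^j - j - 1} \sum_{d} r(d) s(d) \\
&= N^{2^j - j - 1}\, r(0) s(0) + N^{2^j - j - 1} \sum_{d \neq 0} r(d) s(d) \\
&\ll N^{2^j - j - 1} N^{j} N^{2^j - j + \varepsilon} + N^{2^j - j - 1} N^\varepsilon \sum_{d \neq 0} s(d) \\
&\ll N^{2^{j+1} - (j+1) + \varepsilon} + N^{2^j - j - 1} N^\varepsilon N^{2^j} \\
&\ll N^{2^{j+1} - (j+1) + \varepsilon}.
\end{aligned}
\]
This completes the proof.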
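
By orthogonality, the moment $\int_0^1 |T(\alpha)|^{2h}\, d\alpha$ equals the number of solutions of $x_1^k + \cdots + x_h^k = y_1^k + \cdots + y_h^k$ with $1 \leq x_i, y_i \leq N$; this is the quantity $s(0)$ in the proof when $h = 2^{j-1}$. The Python sketch below counts these solutions by brute force for small parameters. It is an illustration only, and the function name `moment` is an assumption.

```python
from collections import Counter
from itertools import product


def moment(N, k, h):
    """Number of solutions of x_1^k + ... + x_h^k = y_1^k + ... + y_h^k with 1 <= x_i, y_i <= N.

    By orthogonality this equals the integral of |T(alpha)|^(2h) over [0, 1].
    """
    sums = Counter(
        sum(x ** k for x in xs) for xs in product(range(1, N + 1), repeat=h)
    )
    # Each value attained c times on the left pairs with c tuples on the right.
    return sum(c * c for c in sums.values())


# j = 1 (h = 1): the second moment is exactly N.
print(moment(10, 3, 1))
# j = 2 (h = 2): the fourth moment for squares.
for N in (5, 10, 20):
    print(N, moment(N, 2, 2))
```

The case $h = 1$ reproduces the base case of the induction, where the integral equals $N$.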


4.6 Notes 


The material in this chapter is well-known. For the original proofs of Weyl’s
inequality and Hua’s lemma, see Weyl [141] and Hua [62], respectively. Davenport [18],
Schmidt [106], and Vaughan [125] are standard and excellent introductions
to the circle method in additive number theory.

The easier Waring’s problem was introduced by Wright [150]. 


4.7 Exercises 
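1. Prove that
\[ \|x\| = \|-x\| = \|n + x\| \]
for all $x \in \mathbf{R}$ and $n \in \mathbf{Z}$. Let $\{x\}$ denote the fractional part of $x$. Graph
$f(x) = \{x\} + \|x\|$ for $0 \leq x \leq 1$.


2. Prove that
\[ \|\alpha + \beta\| \leq \|\alpha\| + \|\beta\| \]
for all $\alpha, \beta \in \mathbf{R}$.


3. Let $\ell \geq 1$, and let $\Delta^{(\ell)}$ denote the iterated difference operator $\Delta_{1, \ldots, 1}$. Prove
that
\[ \Delta^{(\ell)} f(x) = \sum_{j=0}^{\ell} (-1)^{\ell - j} \binom{\ell}{j} f(x + j). \]

4. Let $\Delta_{d_\ell, \ldots, d_1}$ be an iterated difference operator. Find a general formula to
express $\Delta_{d_\ell, \ldots, d_1}(f)(x)$.

5. Let $\ell \geq 2$, let $\sigma$ be a permutation of $\{1, 2, \ldots, \ell\}$, and let $\Delta_{d_\ell, \ldots, d_1}$ be an
iterated difference operator. Prove that
\[ \Delta_{d_{\sigma(\ell)}, \ldots, d_{\sigma(1)}} = \Delta_{d_\ell, \ldots, d_1}. \]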


5

The Hardy-Littlewood asymptotic formula


... using essentially the same techniques as Hardy and Littlewood’s
but in a different way and introducing certain additional considerations,
we shall derive the same result with incomparable brevity and
simplicity.


I. M. Vinogradov [131]


5.1 The circle method 


For any positive integers $k$ and $s$, let $r_{k,s}(N)$ denote the number of representations
of $N$ as the sum of $s$ positive $k$th powers, that is, the number of $s$-tuples $(x_1, \ldots, x_s)$
of positive integers such that
\[ N = x_1^k + \cdots + x_s^k. \]
Waring’s problem is to prove that every nonnegative integer is the sum of a bounded
number of $k$th powers. Since $1 = 1^k$ is a $k$th power, this is equivalent to showing
that
\[ r_{k,s}(N) > 0 \]
for some $s$ and for all sufficiently large integers $N$. Hilbert gave the first proof of
Waring’s problem in 1909. Ten years later, Hardy and Littlewood succeeded in
finding a beautiful asymptotic formula for $r_{k,s}(N)$. They proved that for $s \geq s_0(k)$,
there exists $\delta = \delta(s, k) > 0$ such that
\[ r_{k,s}(N) = \mathfrak{S}(N)\, \Gamma\left( 1 + \frac{1}{k} \right)^{s} \Gamma\left( \frac{s}{k} \right)^{-1} N^{(s/k) - 1} + O\left( N^{(s/k) - 1 - \delta} \right), \qquad (5.1) \]
where $\Gamma(x)$ is the Gamma function and $\mathfrak{S}(N)$ is the “singular series,” an arithmetic
function that is uniformly bounded above and below by positive constants
depending only on $k$ and $s$. We shall prove that the asymptotic formula (5.1) holds
for $s_0(k) = 2^k + 1$.

Hardy and Littlewood used the “circle method” to obtain their result. The idea
at the heart of the circle method is simple. Let $A$ be any set of nonnegative integers.
The generating function for $A$ is
\[ f(z) = \sum_{a \in A} z^a. \]
We can consider $f(z)$ either as a formal power series in $z$ or as the Taylor series
of an analytic function that converges in the open unit disc $|z| < 1$. In both cases,
\[ f(z)^s = \sum_{N=0}^{\infty} r_{A,s}(N) z^N, \]
where $r_{A,s}(N)$ is the number of representations of $N$ as the sum of $s$ elements of
$A$, that is, the number of solutions of the equation
\[ N = a_1 + a_2 + \cdots + a_s \]
with
\[ a_1, a_2, \ldots, a_s \in A. \]
By Cauchy’s theorem, we can recover $r_{A,s}(N)$ by integration:
\[ r_{A,s}(N) = \frac{1}{2\pi i} \int_{|z| = \rho} \frac{f(z)^s}{z^{N+1}}\, dz \]
for any $\rho \in (0, 1)$.

This is the original form of the “circle method” introduced by Hardy, Littlewood,
and Ramanujan in 1918-20. They evaluated the integral by dividing the circle of
integration into two disjoint sets, the “major arcs” and the “minor arcs.” In the
classical applications to Waring’s problem, the integral over the minor arcs is
negligible, and the integral over the major arcs provides the main term in the
estimate for $r_{A,s}(N)$.

Vinogradov greatly simplified and improved the circle method. He observed
that in order to study $r_{A,s}(N)$, it is possible to replace the power series $f(z)$ with
the polynomial
\[ p(z) = \sum_{\substack{a \in A \\ a \leq N}} z^a. \]
Then
\[ p(z)^s = \sum_{m=0}^{sN} r_{A,s}^{(N)}(m) z^m, \]
where $r_{A,s}^{(N)}(m)$ is the number of representations of $m$ as the sum of $s$ elements of $A$
not exceeding $N$. In particular, since the elements of $A$ are nonnegative, we have
$r_{A,s}^{(N)}(m) = r_{A,s}(m)$ for $m \leq N$ and $r_{A,s}^{(N)}(m) = 0$ for $m > sN$. If we let
\[ z = e(\alpha) = e^{2\pi i \alpha}, \]
then we obtain the trigonometric polynomial
\[ F(\alpha) = p(e(\alpha)) = \sum_{\substack{a \in A \\ a \leq N}} e(a\alpha) \]
and
\[ F(\alpha)^s = \sum_{m=0}^{sN} r_{A,s}^{(N)}(m)\, e(m\alpha). \]
From the basic orthogonality relation for the functions $e(n\alpha)$,
\[ \int_0^1 e(m\alpha) e(-n\alpha)\, d\alpha = \begin{cases} 1 & \text{if } m = n \\ 0 & \text{if } m \neq n, \end{cases} \]
we obtain
\[ r_{A,s}(N) = \int_0^1 F(\alpha)^s e(-N\alpha)\, d\alpha. \]
In applications, of course, the hard part is to estimate the integral.

To apply the circle method to Waring’s problem, let $k \geq 2$ and $A$ be the set of
positive $k$th powers. Let $r_{k,s}(N)$ denote the number of representations of $N$ as the
sum of $s$ positive $k$th powers. Let
\[ P = [N^{1/k}]. \]
Then
\[ F(\alpha) = \sum_{\substack{a \in A \\ a \leq N}} e(a\alpha) = \sum_{n=1}^{P} e(\alpha n^k) \]
and
\[ r_{k,s}(N) = \int_0^1 F(\alpha)^s e(-\alpha N)\, d\alpha. \]
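
The orthogonality formula for $r_{k,s}(N)$ can be tested on small cases. The Python sketch below is an illustration, not part of the original text. It replaces the integral over $[0, 1]$ by an average over the $M$ points $\alpha = t/M$, which is exact for a trigonometric polynomial as soon as $M$ exceeds both $N$ and every frequency of $F(\alpha)^s$ (a discrete form of the same orthogonality relation), and compares the result with a brute-force count. The names `kth_root_floor`, `r_bruteforce`, and `r_circle` are assumptions.

```python
import cmath
from itertools import product


def kth_root_floor(N, k):
    """P = [N^(1/k)], computed with an integer correction of the floating-point root."""
    P = int(round(N ** (1.0 / k)))
    while P ** k > N:
        P -= 1
    while (P + 1) ** k <= N:
        P += 1
    return P


def r_bruteforce(N, k, s):
    """Number of s-tuples of positive integers with x_1^k + ... + x_s^k = N."""
    P = kth_root_floor(N, k)
    return sum(
        1 for xs in product(range(1, P + 1), repeat=s) if sum(x ** k for x in xs) == N
    )


def r_circle(N, k, s):
    """r_{k,s}(N) via the average of F(alpha)^s e(-N alpha) over alpha = t/M, t = 0..M-1."""
    P = kth_root_floor(N, k)
    M = max(s * P ** k, N) + 1  # larger than N and than every frequency of F^s
    total = 0
    for t in range(M):
        # Exponents are reduced mod M so the arguments stay small and accurate.
        F = sum(cmath.exp(2j * cmath.pi * ((t * n ** k) % M) / M) for n in range(1, P + 1))
        total += F ** s * cmath.exp(-2j * cmath.pi * ((t * N) % M) / M)
    return round((total / M).real)


print(r_circle(7, 2, 4), r_bruteforce(7, 2, 4))
```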
5.2 Waring’s problem for k = 1


For $k = 1$, there is an explicit formula for $r_{1,s}(N)$.

Theorem 5.1 Let $s \geq 1$. Then
\[ r_{1,s}(N) = \binom{N - 1}{s - 1} = \frac{N^{s-1}}{(s - 1)!} + O\left( N^{s-2} \right) \]
for all positive integers $N$.
Proof. Let $N \geq s$. We observe that
\[ N = a_1 + \cdots + a_s \]
is a decomposition of $N$ into $s$ positive parts if and only if
\[ N - s = (a_1 - 1) + \cdots + (a_s - 1) \]
is a decomposition of $N - s$ into $s$ nonnegative parts. Therefore,
\[ r_{1,s}(N) = R_{1,s}(N - s), \]
where $R_{1,s}(N)$ denotes the number of representations of $N$ as the sum of $s$ nonnegative
integers.

We shall give two proofs of the theorem. The first is combinatorial. We begin
by computing $R_{1,s}(N)$ for every nonnegative integer $N$. Let $N = a_1 + \cdots + a_s$ be
a partition into nonnegative integers. Imagine a row of $N + s - 1$ boxes. We color
the first $a_1$ boxes red, the next box blue, the next $a_2$ boxes red, the next box blue,
and so on. There will be exactly $s - 1$ blue boxes. Conversely, if we choose $s - 1$
of the $N + s - 1$ boxes and color them blue, and if we color the remaining $N$ boxes
red, then we have a partition of $N$ into $s$ nonnegative parts as follows. Let $a_1$ be the
number of red boxes before the first blue box, $a_2$ the number of red boxes between
the first and second blue boxes, and, in general, for $j = 2, \ldots, s - 1$, let $a_j$ be
the number of red boxes that are between the $(j - 1)$-st and $j$th blue boxes. Let
$a_s$ be the number of red boxes that come after the last blue box. This establishes a
one-to-one correspondence between the subsets of size $s - 1$ of the $N + s - 1$ boxes
and the representations of $N$ as the sum of $s$ nonnegative integers. Therefore, the
number of decompositions of $N$ into $s$ nonnegative parts is the binomial coefficient
$\binom{N + s - 1}{s - 1}$. It follows that
\[ r_{1,s}(N) = R_{1,s}(N - s) = \binom{N - 1}{s - 1}. \]
This gives the first proof of the theorem.
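There is also a simple analytic proof. The series
\[ f(z) = \sum_{N=0}^{\infty} z^N = \frac{1}{1 - z} \]
converges for $|z| < 1$, and
\[ f(z)^s = \sum_{N=0}^{\infty} R_{1,s}(N) z^N. \]
We also have
\[
\begin{aligned}
f(z)^s &= \frac{1}{(1 - z)^s} \\
&= \frac{1}{(s - 1)!} \frac{d^{s-1}}{dz^{s-1}} \left( \frac{1}{1 - z} \right) \\
&= \frac{1}{(s - 1)!} \frac{d^{s-1}}{dz^{s-1}} \left( \sum_{N=0}^{\infty} z^N \right) \\
&= \sum_{N=s-1}^{\infty} \frac{N(N - 1) \cdots (N - s + 2)}{(s - 1)!} z^{N - s + 1} \\
&= \sum_{N=s-1}^{\infty} \binom{N}{s - 1} z^{N - s + 1} \\
&= \sum_{N=0}^{\infty} \binom{N + s - 1}{s - 1} z^N.
\end{aligned}
\]
Therefore,
\[ R_{1,s}(N) = \binom{N + s - 1}{s - 1}. \]
This completes the proof.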
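
The formula of Theorem 5.1 is easy to confirm by direct enumeration. The following Python sketch is an illustration only; the function name `r_1s` is an assumption. It counts ordered $s$-tuples of positive integers with sum $N$ and compares the count with the binomial coefficient.

```python
from itertools import product
from math import comb


def r_1s(N, s):
    """Number of ordered s-tuples of positive integers with sum N (brute force)."""
    return sum(1 for parts in product(range(1, N + 1), repeat=s) if sum(parts) == N)


for N, s in ((6, 3), (8, 4), (5, 1)):
    print(N, s, r_1s(N, s), comb(N - 1, s - 1))
```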


5.3 The Hardy-Littlewood decomposition


For $k \geq 2$ there is no easy way to compute, or even to estimate, $r_{k,s}(N)$ for large
$N$. It was a great achievement of Hardy and Littlewood to obtain an asymptotic
formula for $r_{k,s}(N)$ for all $k \geq 2$ and $s \geq s_0(k)$. In this chapter, we shall prove the
Hardy-Littlewood asymptotic formula for $s \geq 2^k + 1$. For $N \geq 2^k$ let
\[ P = [N^{1/k}] \qquad (5.2) \]
and
\[ F(\alpha) = \sum_{m=1}^{P} e(\alpha m^k). \qquad (5.3) \]
The trigonometric polynomial $F(\alpha)$ is the generating function for representing $N$
as the sum of $k$th powers. The basis of the circle method is the simple formula
\[ r_{k,s}(N) = \int_0^1 F(\alpha)^s e(-N\alpha)\, d\alpha. \qquad (5.4) \]
We cannot compute this integral explicitly in terms of elementary functions. By
carefully estimating the integral, however, we shall derive the Hardy-Littlewood
asymptotic formula.

The first step is to decompose the unit interval $[0, 1]$ into two disjoint sets, called
the major arcs $\mathfrak{M}$ and the minor arcs $\mathfrak{m}$, and to evaluate the integral separately
over both sets. The major arcs will consist of all real numbers $\alpha \in [0, 1]$ that can,
in a certain sense, be “well approximated” by rational numbers, and the minor arcs
consist of the numbers $\alpha \in [0, 1]$ that cannot be well approximated. Although most
of the mass of the unit interval lies in the minor arcs, it will follow from Weyl’s
inequality and Hua’s lemma that the integral of $F(\alpha)^s e(-N\alpha)$ over the minor arcs
is negligible. The integral over the major arcs will factor into the product of two
terms: the “singular integral” $J(N)$ and the “singular series” $\mathfrak{S}(N)$. The singular
integral will be evaluated in terms of the Gamma function, and the singular series
will be estimated by elementary number theory.

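The major and minor arcs are constructed as follows. Let $N \geq 2^k$. Then
$P = [N^{1/k}] \geq 2$. Choose
\[ 0 < \nu < 1/5. \]
For
\[ 1 \leq q \leq P^\nu, \]
\[ 0 \leq a \leq q, \]
and
\[ (a, q) = 1, \]
we let
\[ \mathfrak{M}(q, a) = \left\{ \alpha \in [0, 1] : \left| \alpha - \frac{a}{q} \right| \leq \frac{1}{P^{k-\nu}} \right\} \]
and
\[ \mathfrak{M} = \bigcup_{1 \leq q \leq P^\nu} \ \bigcup_{\substack{a = 0 \\ (a, q) = 1}}^{q} \mathfrak{M}(q, a). \]
The interval $\mathfrak{M}(q, a)$ is called a major arc, and $\mathfrak{M}$ is the set of all major arcs. We
see that
\[ \mathfrak{M}(1, 0) = \left[ 0, \frac{1}{P^{k-\nu}} \right], \]
\[ \mathfrak{M}(1, 1) = \left[ 1 - \frac{1}{P^{k-\nu}}, 1 \right], \]
and
\[ \mathfrak{M}(q, a) = \left[ \frac{a}{q} - \frac{1}{P^{k-\nu}}, \frac{a}{q} + \frac{1}{P^{k-\nu}} \right] \]
for $q \geq 2$. The major arcs consist of all real numbers $\alpha \in [0, 1]$ that are well
approximated by rationals in the sense that they are close, within distance $P^{\nu - k}$,
to a rational number with denominator no greater than $P^\nu$.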
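If $\alpha \in \mathfrak{M}(q, a) \cap \mathfrak{M}(q', a')$ and $a/q \neq a'/q'$, then $|aq' - a'q| \geq 1$ and
\[
\begin{aligned}
\frac{1}{P^{2\nu}} &\leq \frac{1}{qq'} \\
&\leq \left| \frac{a}{q} - \frac{a'}{q'} \right| \\
&\leq \left| \alpha - \frac{a}{q} \right| + \left| \alpha - \frac{a'}{q'} \right| \\
&\leq \frac{2}{P^{k-\nu}},
\end{aligned}
\]
which is impossible for $P \geq 2$ and $k \geq 2$. Therefore, the major arcs $\mathfrak{M}(q, a)$ are
pairwise disjoint.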

The measure of the set $\mathfrak{M}(1, 0) \cup \mathfrak{M}(1, 1)$ is $2P^{\nu - k}$, and, for every $q \geq 2$ and
$(a, q) = 1$, the measure of the major arc $\mathfrak{M}(q, a)$ is $2P^{\nu - k}$. For every $q \geq 2$ there
are exactly $\varphi(q)$ positive integers $a$ such that $1 \leq a \leq q$ and $(q, a) = 1$. It follows
that the measure of the set $\mathfrak{M}$ of major arcs is
\[
\mu(\mathfrak{M}) = \frac{2}{P^{k-\nu}} \sum_{1 \leq q \leq P^\nu} \varphi(q)
\leq \frac{2}{P^{k-\nu}} \sum_{1 \leq q \leq P^\nu} q
\leq \frac{2}{P^{k-\nu}} \cdot \frac{P^\nu (P^\nu + 1)}{2}
\leq \frac{2}{P^{k - 3\nu}}, \qquad (5.5)
\]
which goes to zero as $P$ goes to infinity.
The set
\[ \mathfrak{m} = [0, 1] \setminus \mathfrak{M} \]
is called the set of minor arcs. This set is a finite union of open intervals and
consists of all $\alpha \in [0, 1]$ that are not well approximated by rationals. The measure
of the set of minor arcs is
\[ \mu(\mathfrak{m}) = 1 - \mu(\mathfrak{M}) \geq 1 - \frac{2}{P^{k - 3\nu}}. \]
Even though the measure of the set $\mathfrak{m}$ is large in the sense that it tends to 1 as $P$
tends to infinity, we shall prove in the next section that the integral over the minor
arcs contributes only a negligible amount to $r_{k,s}(N)$.
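
To make the decomposition concrete, the Python sketch below lists the major arcs for given $P$, $k$, and $\nu$, then checks pairwise disjointness and the measure bound (5.5) numerically. It is an illustration in floating-point arithmetic; the function name `major_arcs` and the sample parameters are assumptions.

```python
from math import gcd


def major_arcs(P, k, nu):
    """Return the major arcs as sorted tuples (lo, hi, a, q), clipped to [0, 1]."""
    radius = P ** (nu - k)
    arcs = []
    q = 1
    while q <= P ** nu:
        for a in range(0, q + 1):
            if gcd(a, q) == 1:
                lo = max(0.0, a / q - radius)
                hi = min(1.0, a / q + radius)
                arcs.append((lo, hi, a, q))
        q += 1
    return sorted(arcs)


P, k, nu = 1000, 2, 0.19
arcs = major_arcs(P, k, nu)
measure = sum(hi - lo for lo, hi, _, _ in arcs)
disjoint = all(arcs[i][1] < arcs[i + 1][0] for i in range(len(arcs) - 1))
print(len(arcs), disjoint, measure, 2 / P ** (k - 3 * nu))
```

With these sample parameters only the denominators $q = 1, 2, 3$ occur, so the major arcs are five very short intervals and almost all of $[0, 1]$ belongs to the minor arcs.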


5.4 The minor arcs 


We shall now show that the integral over the minor arcs is small. 


Theorem 5.2 Let $k \geq 2$ and $s \geq 2^k + 1$. There exists $\delta_1 > 0$ such that
\[ \int_{\mathfrak{m}} F(\alpha)^s e(-N\alpha)\, d\alpha = O\left( P^{s - k - \delta_1} \right), \]
where the implied constant depends only on $k$ and $s$.
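Proof. By Dirichlet’s theorem (Theorem 4.1) with $Q = P^{k-\nu}$, to every real
number $\alpha$ there corresponds a fraction $a/q$ such that
\[ 1 \leq q \leq P^{k-\nu}, \qquad (a, q) = 1, \]
and
\[ \left| \alpha - \frac{a}{q} \right| < \frac{1}{q P^{k-\nu}} \leq \min\left( \frac{1}{P^{k-\nu}}, \frac{1}{q^2} \right). \]
If $\alpha \in \mathfrak{m}$, then $\alpha \notin \mathfrak{M}(1, 0) \cup \mathfrak{M}(1, 1)$, so
\[ \frac{1}{P^{k-\nu}} < \alpha < 1 - \frac{1}{P^{k-\nu}} \]
and $1 \leq a \leq q - 1$. If $q \leq P^\nu$, then
\[ \left| \alpha - \frac{a}{q} \right| \leq \frac{1}{P^{k-\nu}} \]
implies that
\[ \alpha \in \mathfrak{M}(q, a) \subseteq \mathfrak{M} = [0, 1] \setminus \mathfrak{m}, \]
which is absurd. Therefore,
\[ P^\nu < q \leq P^{k-\nu}. \]
Let
\[ K = 2^{k-1}. \qquad (5.6) \]
It follows from Weyl’s inequality (Theorem 4.3) with $f(x) = \alpha x^k$ that
\[
\begin{aligned}
F(\alpha) &\ll P^{1+\varepsilon} \left( P^{-1} + q^{-1} + P^{-k} q \right)^{1/K} \\
&\ll P^{1+\varepsilon} \left( P^{-1} + P^{-\nu} + P^{-k} P^{k-\nu} \right)^{1/K} \\
&\ll P^{1 + \varepsilon - \nu/K}.
\end{aligned}
\]
Applying Hua’s lemma (Theorem 4.6), we obtain
\[
\begin{aligned}
\left| \int_{\mathfrak{m}} F(\alpha)^s e(-N\alpha)\, d\alpha \right|
&= \left| \int_{\mathfrak{m}} F(\alpha)^{s - 2^k} F(\alpha)^{2^k} e(-N\alpha)\, d\alpha \right| \\
&\leq \int_{\mathfrak{m}} |F(\alpha)|^{s - 2^k} |F(\alpha)|^{2^k}\, d\alpha \\
&\leq \max_{\alpha \in \mathfrak{m}} |F(\alpha)|^{s - 2^k} \int_0^1 |F(\alpha)|^{2^k}\, d\alpha \\
&\ll \left( P^{1 + \varepsilon - \nu/K} \right)^{s - 2^k} P^{2^k - k + \varepsilon} \\
&= P^{s - k - \delta_1},
\end{aligned}
\]
where
\[ \delta_1 = \frac{\nu(s - 2^k)}{K} - (s - 2^k + 1)\varepsilon > 0 \]
if $\varepsilon > 0$ is chosen sufficiently small. This completes the proof.
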
5.5 The major arcs


We introduce the auxiliary functions
\[ v(\beta) = \sum_{m=1}^{N} \frac{1}{k} m^{1/k - 1} e(\beta m) \]
and
\[ S(q, a) = \sum_{r=1}^{q} e(ar^k/q). \]
We shall prove that if $\alpha$ lies in the major arc $\mathfrak{M}(q, a)$, then $F(\alpha)$ is the product of
$S(q, a)/q$ and $v(\alpha - a/q)$, plus a small error term. We begin by estimating these
functions.

Clearly, $|S(q, a)| \leq q$. By Weyl’s inequality (Theorem 4.4), we have
\[ S(q, a) \ll q^{1 - 1/K + \varepsilon} \]
and
\[ \frac{S(q, a)}{q} \ll q^{-1/K + \varepsilon}, \qquad (5.7) \]
where the implied constant depends only on $\varepsilon$.


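Lemma 5.1 If $|\beta| \leq 1/2$, then
\[ v(\beta) \ll \min\left( P, |\beta|^{-1/k} \right). \]

Proof. The function
\[ f(x) = \frac{1}{k} x^{1/k - 1} \]
is positive, continuous, and decreasing for $x \geq 1$. By Lemma A.2, it follows that
\[
\begin{aligned}
|v(\beta)| &\leq \sum_{m=1}^{N} \frac{1}{k} m^{1/k - 1} \\
&\leq \int_1^N \frac{1}{k} x^{1/k - 1}\, dx + f(1) \\
&< N^{1/k} \\
&\ll P.
\end{aligned}
\]
If $|\beta| \leq 1/N$, then $P \leq N^{1/k} \leq |\beta|^{-1/k}$ and $v(\beta) \ll \min(P, |\beta|^{-1/k})$.
Suppose that $1/N < |\beta| \leq 1/2$. Then $|\beta|^{-1/k} \ll P$. Let $M = [|\beta|^{-1}]$. Then
\[ M \leq \frac{1}{|\beta|} < M + 1 \leq N. \]
Let $U(t) = \sum_{m \leq t} e(\beta m)$. By Lemma 4.7, we have $U(t) \ll \|\beta\|^{-1} = |\beta|^{-1}$. By
partial summation (Theorem A.4),
\[
\begin{aligned}
\sum_{m=M+1}^{N} \frac{1}{k} m^{1/k - 1} e(\beta m) &= f(N)U(N) - f(M)U(M) - \int_M^N U(t) f'(t)\, dt \\
&\ll \frac{M^{1/k - 1}}{|\beta|} \\
&\ll |\beta|^{-1/k} \\
&\ll \min\left( P, |\beta|^{-1/k} \right).
\end{aligned}
\]
Therefore,
\[
\begin{aligned}
v(\beta) &= \sum_{m=1}^{M} \frac{1}{k} m^{1/k - 1} e(\beta m) + \sum_{m=M+1}^{N} \frac{1}{k} m^{1/k - 1} e(\beta m) \\
&\ll \min\left( P, |\beta|^{-1/k} \right).
\end{aligned}
\]
This completes the proof.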

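Lemma 5.2 Let $q$ and $a$ be integers such that $1 \le q \le P^{\nu}$, $0 \le a \le q$, and $(a,q) = 1$. If $\alpha \in \mathfrak{M}(q,a)$, then
\[ F(\alpha) = \left(\frac{S(q,a)}{q}\right) v\left(\alpha - \frac{a}{q}\right) + O\left(P^{2\nu}\right). \]

Proof. Let $\beta = \alpha - a/q$. Then $|\beta| \le P^{\nu-k}$ and
\[
\begin{aligned}
F(\alpha) - \frac{S(q,a)}{q}\, v(\beta)
&= \sum_{m=1}^{P} e(\alpha m^k) - \frac{S(q,a)}{q} \sum_{m=1}^{N} \frac{1}{k}\, m^{1/k-1} e(\beta m) \\
&= \sum_{m=1}^{P} e\left(\frac{a m^k}{q}\right) e(\beta m^k) - \frac{S(q,a)}{q} \sum_{m=1}^{N} \frac{1}{k}\, m^{1/k-1} e(\beta m) \\
&= \sum_{m=1}^{N} u(m) e(\beta m),
\end{aligned}
\]
where
\[
u(m) =
\begin{cases}
e(am/q) - (S(q,a)/q)\,k^{-1} m^{1/k-1} & \text{if $m$ is a $k$th power,} \\
-(S(q,a)/q)\,k^{-1} m^{1/k-1} & \text{otherwise.}
\end{cases}
\]
We shall estimate the last sum. Let $y \ge 1$. Since $|S(q,a)| \le q$, we have
\[
\begin{aligned}
\sum_{1 \le m \le y} e(am^k/q)
&= \sum_{r=1}^{q} e(ar^k/q) \sum_{\substack{1 \le m \le y \\ m \equiv r \pmod q}} 1 \\
&= S(q,a)\left(\frac{y}{q} + O(1)\right) \\
&= y\left(\frac{S(q,a)}{q}\right) + O(q).
\end{aligned}
\]
Let $t \ge 1$. Since $v(\beta) \ll P$, we have
\[
\begin{aligned}
U(t) &= \sum_{1 \le m \le t} u(m) \\
&= \sum_{1 \le m \le t^{1/k}} e(am^k/q) - \frac{S(q,a)}{q} \sum_{1 \le m \le t} \frac{1}{k}\, m^{1/k-1} \\
&= t^{1/k}\left(\frac{S(q,a)}{q}\right) + O(q) - \left(\frac{S(q,a)}{q}\right)\left(t^{1/k} + O(1)\right) \\
&= O(q).
\end{aligned}
\]
By partial summation,
\[
\begin{aligned}
\sum_{m=1}^{N} u(m) e(\beta m)
&= e(\beta N) U(N) - 2\pi i \beta \int_1^N e(\beta t) U(t)\,dt \\
&= O(q) - 2\pi i \beta \int_1^N e(\beta t)\, O(q)\,dt \\
&\ll q + |\beta| N q \\
&\ll (1 + |\beta| N)\, q \\
&\ll \left(1 + P^{\nu-k} P^{k}\right) P^{\nu} \\
&\ll P^{2\nu}.
\end{aligned}
\]
This completes the proof.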


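Theorem 5.3 Let
\[ \mathfrak{S}(N,Q) = \sum_{1 \le q \le Q}\ \sum_{\substack{a=1 \\ (a,q)=1}}^{q} \left(\frac{S(q,a)}{q}\right)^{s} e(-Na/q) \]
and
\[ J^*(N) = \int_{-P^{\nu-k}}^{P^{\nu-k}} v(\beta)^s e(-N\beta)\,d\beta. \]
Let $\mathfrak{M}$ denote the set of major arcs. Then
\[ \int_{\mathfrak{M}} F(\alpha)^s e(-N\alpha)\,d\alpha = \mathfrak{S}(N, P^{\nu})\, J^*(N) + O\left(P^{s-k-\delta_2}\right), \]
where $\delta_2 = (1 - 5\nu)/k > 0$.

Proof. Let $\alpha \in \mathfrak{M}(q,a)$ and
\[ \beta = \alpha - \frac{a}{q}. \]
Let
\[ V = V(\alpha, q, a) = \frac{S(q,a)}{q}\, v\left(\alpha - \frac{a}{q}\right) = \frac{S(q,a)}{q}\, v(\beta). \]
Since $|S(q,a)| \le q$, we have $|V| \ll |v(\beta)| \ll P$ by Lemma 5.1. Let $F = F(\alpha)$. Then $|F| \le P$. Since $F - V = O(P^{2\nu})$ by Lemma 5.2, it follows that
\[
\begin{aligned}
F^s - V^s &= (F - V)\left(F^{s-1} + F^{s-2}V + \cdots + V^{s-1}\right) \\
&\ll P^{2\nu} P^{s-1} \\
&= P^{s-1+2\nu}.
\end{aligned}
\]
Since $\mu(\mathfrak{M}) \ll P^{3\nu-k}$ by (5.5), it follows that
\[ \int_{\mathfrak{M}} \left|F^s - V^s\right| d\alpha \ll P^{3\nu-k} P^{s-1+2\nu} = P^{s-k-(1-5\nu)} \le P^{s-k-\delta_2}, \]
where $\delta_2 = (1-5\nu)/k > 0$. Therefore,
\[
\begin{aligned}
\int_{\mathfrak{M}} F(\alpha)^s e(-N\alpha)\,d\alpha
&= \int_{\mathfrak{M}} V(\alpha,q,a)^s e(-N\alpha)\,d\alpha + O\left(P^{s-k-\delta_2}\right) \\
&= \sum_{1 \le q \le P^{\nu}}\ \sum_{\substack{a=0 \\ (a,q)=1}}^{q} \int_{\mathfrak{M}(q,a)} V(\alpha,q,a)^s e(-N\alpha)\,d\alpha + O\left(P^{s-k-\delta_2}\right).
\end{aligned}
\]
For $q \ge 2$, we have
\[
\begin{aligned}
\int_{\mathfrak{M}(q,a)} V(\alpha,q,a)^s e(-N\alpha)\,d\alpha
&= \int_{a/q - P^{\nu-k}}^{a/q + P^{\nu-k}} V(\alpha,q,a)^s e(-N\alpha)\,d\alpha \\
&= \int_{-P^{\nu-k}}^{P^{\nu-k}} V(\beta + a/q, q, a)^s e(-N(\beta + a/q))\,d\beta \\
&= \left(\frac{S(q,a)}{q}\right)^{s} e(-Na/q) \int_{-P^{\nu-k}}^{P^{\nu-k}} v(\beta)^s e(-N\beta)\,d\beta \\
&= \left(\frac{S(q,a)}{q}\right)^{s} e(-Na/q)\, J^*(N).
\end{aligned}
\]
For $q = 1$ we have $V(\alpha,1,0) = v(\alpha)$ and $V(\alpha,1,1) = v(\alpha - 1)$. Therefore,
\[
\begin{aligned}
\int_{\mathfrak{M}(1,0)} &V(\alpha,1,0)^s e(-N\alpha)\,d\alpha + \int_{\mathfrak{M}(1,1)} V(\alpha,1,1)^s e(-N\alpha)\,d\alpha \\
&= \int_0^{P^{\nu-k}} v(\alpha)^s e(-N\alpha)\,d\alpha + \int_{1-P^{\nu-k}}^{1} v(\alpha-1)^s e(-N\alpha)\,d\alpha \\
&= \int_0^{P^{\nu-k}} v(\beta)^s e(-N\beta)\,d\beta + \int_{-P^{\nu-k}}^{0} v(\beta)^s e(-N\beta)\,d\beta \\
&= J^*(N).
\end{aligned}
\]
Therefore,
\[
\begin{aligned}
\int_{\mathfrak{M}} F(\alpha)^s e(-N\alpha)\,d\alpha
&= \sum_{1 \le q \le P^{\nu}}\ \sum_{\substack{a=1 \\ (a,q)=1}}^{q} \left(\frac{S(q,a)}{q}\right)^{s} e(-Na/q)\, J^*(N) + O\left(P^{s-k-\delta_2}\right) \\
&= \mathfrak{S}(N, P^{\nu})\, J^*(N) + O\left(P^{s-k-\delta_2}\right).
\end{aligned}
\]
This completes the proof.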


5.6 The singular integral 


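Next we consider the integral
\[ J(N) = \int_{-1/2}^{1/2} v(\beta)^s e(-\beta N)\,d\beta. \tag{5.8} \]
This is called the singular integral for Waring’s problem.

Theorem 5.4 There exists $\delta_3 > 0$ such that
\[ J(N) \ll P^{s-k} \]
and
\[ J^*(N) = J(N) + O\left(P^{s-k-\delta_3}\right). \]

Proof. By Lemma 5.1,
\[
\begin{aligned}
J(N) &\ll \int_0^{1/2} \min\left(P, |\beta|^{-1/k}\right)^s d\beta \\
&= \int_0^{1/N} \min\left(P, |\beta|^{-1/k}\right)^s d\beta + \int_{1/N}^{1/2} \min\left(P, |\beta|^{-1/k}\right)^s d\beta \\
&\ll \int_0^{1/N} P^s\,d\beta + \int_{1/N}^{1/2} \beta^{-s/k}\,d\beta \\
&\ll P^{s-k}
\end{aligned}
\]
and
\[
\begin{aligned}
J(N) - J^*(N) &= \int_{P^{\nu-k} \le |\beta| \le 1/2} v(\beta)^s e(-N\beta)\,d\beta \\
&\ll \int_{P^{\nu-k}}^{1/2} \beta^{-s/k}\,d\beta \\
&\ll P^{(k-\nu)(s/k-1)} \\
&= P^{s-k-\delta_3},
\end{aligned}
\]
where $\delta_3 = \nu(s/k - 1) > 0$. This completes the proof.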


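Lemma 5.3 Let $\alpha$ and $\beta$ be real numbers such that $0 < \beta < 1$ and $\alpha \ge \beta$. Then
\[ \sum_{m=1}^{N-1} m^{\beta-1}(N-m)^{\alpha-1} = N^{\alpha+\beta-1}\, \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)} + O\left(N^{\alpha-1}\right), \]
where the implied constant depends only on $\beta$.

Proof. The function
\[ g(x) = x^{\beta-1}(N-x)^{\alpha-1} \]
is positive and continuous on $(0,N)$, integrable on $[0,N]$, and
\[
\begin{aligned}
\int_0^N g(x)\,dx &= \int_0^N x^{\beta-1}(N-x)^{\alpha-1}\,dx \\
&= N^{\alpha+\beta-1} \int_0^1 t^{\beta-1}(1-t)^{\alpha-1}\,dt \\
&= N^{\alpha+\beta-1} B(\alpha,\beta) \\
&= N^{\alpha+\beta-1}\, \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)},
\end{aligned}
\]
where $B(\alpha,\beta)$ is the Beta function and $\Gamma(\alpha)$ is the Gamma function.

If $\alpha \ge 1$, then
\[ g'(x) = g(x)\left(\frac{\beta-1}{x} - \frac{\alpha-1}{N-x}\right) < 0 \]
and so $g(x)$ is decreasing on $(0,N)$ and
\[ \int_1^N g(x)\,dx < \sum_{m=1}^{N-1} g(m) < \int_0^{N-1} g(x)\,dx. \]
Therefore,
\[
\begin{aligned}
0 &< \int_0^N g(x)\,dx - \sum_{m=1}^{N-1} g(m) \\
&< \int_0^1 g(x)\,dx \\
&= \int_0^1 x^{\beta-1}(N-x)^{\alpha-1}\,dx \\
&\le N^{\alpha-1} \int_0^1 x^{\beta-1}\,dx \\
&= \frac{N^{\alpha-1}}{\beta}.
\end{aligned}
\]
If $0 < \beta \le \alpha < 1$, then $0 < \alpha + \beta < 2$ and $g(x)$ has a local minimum at
\[ c = \frac{(1-\beta)N}{2-\alpha-\beta} \in [N/2, N). \]
Since $g(x)$ is strictly decreasing for $x \in (0,c)$, it follows that
\[ \sum_{m=1}^{[c]} g(m) < \int_0^c g(x)\,dx \]
and
\[
\begin{aligned}
\sum_{m=1}^{[c]} g(m) &> \int_1^{[c]} g(x)\,dx + g([c]) \\
&> \int_1^c g(x)\,dx \\
&> \int_0^c g(x)\,dx - \frac{N^{\alpha-1}}{\beta}.
\end{aligned}
\]
Similarly, since $g(x)$ is increasing for $x \in (c,N)$, it follows that
\[ \sum_{m=[c]+1}^{N-1} g(m) < \int_c^N g(x)\,dx \]
and
\[
\begin{aligned}
\sum_{m=[c]+1}^{N-1} g(m) &> \int_{[c]+1}^{N-1} g(x)\,dx + g([c]+1) \\
&> \int_c^{N-1} g(x)\,dx \\
&> \int_c^N g(x)\,dx - \frac{N^{\beta-1}}{\alpha}.
\end{aligned}
\]
Therefore,
\[ 0 < \int_0^N g(x)\,dx - \sum_{m=1}^{N-1} g(m) < \frac{N^{\alpha-1}}{\beta} + \frac{N^{\beta-1}}{\alpha} \le \frac{2N^{\alpha-1}}{\beta}. \]
This completes the proof.

Theorem 5.5 If $s \ge 2$, then
\[ J(N) = \Gamma\left(1 + \frac{1}{k}\right)^{s} \Gamma\left(\frac{s}{k}\right)^{-1} N^{s/k-1} + O\left(N^{(s-1)/k-1}\right). \]

Proof. Let
\[ J_s(N) = \int_{-1/2}^{1/2} v(\beta)^s e(-N\beta)\,d\beta \]
for $s \ge 1$. We shall compute this integral by induction on $s$. Since
\[ v(\beta) = \sum_{m=1}^{N} \frac{1}{k}\, m^{1/k-1} e(\beta m), \]
it follows that
\[ v(\beta)^s = k^{-s} \sum_{m_1=1}^{N} \cdots \sum_{m_s=1}^{N} (m_1 \cdots m_s)^{1/k-1} e((m_1 + \cdots + m_s)\beta) \]
and so
\[
\begin{aligned}
J_s(N) &= k^{-s} \sum_{m_1=1}^{N} \cdots \sum_{m_s=1}^{N} (m_1 \cdots m_s)^{1/k-1} \int_{-1/2}^{1/2} e((m_1 + \cdots + m_s - N)\beta)\,d\beta \\
&= k^{-s} \sum_{\substack{m_1 + \cdots + m_s = N \\ 1 \le m_i \le N}} (m_1 \cdots m_s)^{1/k-1}.
\end{aligned}
\]
In particular, for $s = 2$, we apply Lemma 5.3 with $\alpha = \beta = 1/k$ and obtain
\[
\begin{aligned}
J_2(N) &= k^{-2} \sum_{m=1}^{N-1} m^{1/k-1}(N-m)^{1/k-1} \\
&= \frac{(1/k)^2\, \Gamma(1/k)^2}{\Gamma(2/k)}\, N^{2/k-1} + O\left(N^{1/k-1}\right) \\
&= \frac{\Gamma(1+1/k)^2}{\Gamma(2/k)}\, N^{2/k-1} + O\left(N^{1/k-1}\right).
\end{aligned}
\]
This proves the result in the case where $s = 2$.

If $s \ge 2$ and the theorem holds for $s$, then
\[
\begin{aligned}
J_{s+1}(N) &= \int_{-1/2}^{1/2} v(\beta)^{s+1} e(-N\beta)\,d\beta \\
&= \int_{-1/2}^{1/2} v(\beta)\, v(\beta)^s e(-N\beta)\,d\beta \\
&= \int_{-1/2}^{1/2} \sum_{m=1}^{N} \frac{1}{k}\, m^{1/k-1} e(\beta m)\, v(\beta)^s e(-N\beta)\,d\beta \\
&= \sum_{m=1}^{N} \frac{1}{k}\, m^{1/k-1} \int_{-1/2}^{1/2} v(\beta)^s e(-(N-m)\beta)\,d\beta \\
&= \sum_{m=1}^{N-1} \frac{1}{k}\, m^{1/k-1} J_s(N-m) \\
&= \frac{\Gamma(1+1/k)^s}{\Gamma(s/k)} \sum_{m=1}^{N-1} \frac{1}{k}\, m^{1/k-1}(N-m)^{s/k-1}
 + O\left(\sum_{m=1}^{N-1} \frac{1}{k}\, m^{1/k-1}(N-m)^{(s-1)/k-1}\right).
\end{aligned}
\]
Applying Lemma 5.3 to the main term (with $\alpha = s/k$ and $\beta = 1/k$) and the error term (with $\alpha = (s-1)/k$ and $\beta = 1/k$), we obtain
\[ \sum_{m=1}^{N-1} \frac{1}{k}\, m^{1/k-1}(N-m)^{s/k-1} = \frac{(1/k)\Gamma(1/k)\Gamma(s/k)}{\Gamma((s+1)/k)}\, N^{(s+1)/k-1} + O\left(N^{s/k-1}\right) \]
and
\[ \sum_{m=1}^{N-1} \frac{1}{k}\, m^{1/k-1}(N-m)^{(s-1)/k-1} = O\left(N^{s/k-1}\right). \]
This gives
\[
\begin{aligned}
J_{s+1}(N) &= \frac{(1/k)\Gamma(1/k)\Gamma(s/k)}{\Gamma((s+1)/k)} \cdot \frac{\Gamma(1+1/k)^s}{\Gamma(s/k)}\, N^{(s+1)/k-1} + O\left(N^{s/k-1}\right) \\
&= \frac{\Gamma(1+1/k)^{s+1}}{\Gamma((s+1)/k)}\, N^{(s+1)/k-1} + O\left(N^{s/k-1}\right).
\end{aligned}
\]
This completes the induction.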
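Lemma 5.3, which drives the induction above, can be checked numerically. The following Python sketch is illustrative only and is not part of the original text; the function names are hypothetical. It compares the sum with the Beta-function main term and rescales the difference by $N^{\alpha-1}$; the proof of Lemma 5.3 shows that the rescaled difference lies strictly between 0 and $2/\beta$.

```python
from math import gamma

def beta_sum(N, alpha, beta):
    """Left side of Lemma 5.3: sum_{m=1}^{N-1} m^(beta-1) (N-m)^(alpha-1)."""
    return sum(m ** (beta - 1) * (N - m) ** (alpha - 1) for m in range(1, N))

def main_term(N, alpha, beta):
    """Main term N^(alpha+beta-1) Gamma(alpha) Gamma(beta) / Gamma(alpha+beta)."""
    return N ** (alpha + beta - 1) * gamma(alpha) * gamma(beta) / gamma(alpha + beta)

def scaled_error(N, alpha, beta):
    """(main term - sum) / N^(alpha-1); the proof bounds this by 2/beta."""
    return (main_term(N, alpha, beta) - beta_sum(N, alpha, beta)) / N ** (alpha - 1)

if __name__ == "__main__":
    # alpha = beta = 1/2 is the case s = 2, k = 2 of Theorem 5.5.
    for N in (10, 100, 1000):
        print(N, scaled_error(N, 0.5, 0.5))
```

The rescaled error stays bounded as $N$ grows, which is exactly the $O(N^{\alpha-1})$ statement of the lemma.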


5.7 The singular series 


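In Theorem 5.3, we introduced the function
\[ \mathfrak{S}(N,Q) = \sum_{1 \le q \le Q} A_N(q), \]
where
\[ A_N(q) = \sum_{\substack{a=1 \\ (a,q)=1}}^{q} \left(\frac{S(q,a)}{q}\right)^{s} e\left(\frac{-aN}{q}\right). \]
We define the singular series for Waring’s problem as the arithmetic function
\[ \mathfrak{S}(N) = \sum_{q=1}^{\infty} A_N(q). \]
Let
\[ 0 < \varepsilon < \frac{1}{sK}. \]
Since $s \ge 2^k + 1 = 2K + 1$, we have
\[ \frac{s}{K} - 1 - s\varepsilon \ge 1 + \frac{1}{K} - s\varepsilon = 1 + \delta_4, \]
where
\[ \delta_4 = \frac{1}{K} - s\varepsilon > 0. \]
By (5.7),
\[ A_N(q) \ll \frac{q}{q^{s/K - s\varepsilon}} \le \frac{1}{q^{1+\delta_4}}, \tag{5.9} \]
and so the singular series $\sum_q A_N(q)$ converges absolutely and uniformly with respect to $N$. In particular, there exists a constant $c_2 = c_2(k,s)$ such that
\[ |\mathfrak{S}(N)| \le c_2 \tag{5.10} \]
for all positive integers $N$. Moreover,
\[
\begin{aligned}
\mathfrak{S}(N) - \mathfrak{S}(N, P^{\nu}) &= \sum_{q > P^{\nu}} A_N(q) \\
&\ll \sum_{q > P^{\nu}} \frac{1}{q^{1+\delta_4}} \\
&\ll P^{-\nu\delta_4}.
\end{aligned}
\]
We shall show that $\mathfrak{S}(N)$ is a positive real number for all $N$ and that there exists a positive constant $c_1$ depending only on $k$ and $s$ such that
\[ 0 < c_1 < \mathfrak{S}(N) < c_2 \]
for all positive integers $N$. The proof is a nice exercise in elementary number theory. We begin by showing that $A_N(q)$ is a multiplicative function of $q$.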


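Lemma 5.4 Let $(q,r) = 1$. Then
\[ S(qr, ar + bq) = S(q,a)\,S(r,b). \]

Proof. Since $(q,r) = 1$, the sets $\{xr : 1 \le x \le q\}$ and $\{yq : 1 \le y \le r\}$ are complete residue systems modulo $q$ and $r$, respectively. Because every congruence class modulo $qr$ can be written uniquely in the form $xr + yq$, where $1 \le x \le q$ and $1 \le y \le r$, it follows that
\[
\begin{aligned}
S(qr, ar + bq) &= \sum_{m=1}^{qr} e\left(\frac{(ar+bq)m^k}{qr}\right) \\
&= \sum_{x=1}^{q} \sum_{y=1}^{r} e\left(\frac{(ar+bq)(xr+yq)^k}{qr}\right) \\
&= \sum_{x=1}^{q} \sum_{y=1}^{r} e\left(\frac{ar+bq}{qr} \sum_{\ell=0}^{k} \binom{k}{\ell} (xr)^{\ell} (yq)^{k-\ell}\right) \\
&= \sum_{x=1}^{q} \sum_{y=1}^{r} e\left(\frac{ar+bq}{qr} \left((xr)^k + (yq)^k\right)\right) \\
&= \sum_{x=1}^{q} \sum_{y=1}^{r} e\left(\frac{a(xr)^k}{q}\right) e\left(\frac{b(yq)^k}{r}\right) \\
&= \sum_{x=1}^{q} e\left(\frac{a(xr)^k}{q}\right) \sum_{y=1}^{r} e\left(\frac{b(yq)^k}{r}\right) \\
&= S(q,a)\,S(r,b).
\end{aligned}
\]
This completes the proof.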
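The factorization in Lemma 5.4 is easy to test numerically for small moduli. The sketch below is an illustration added here, not part of the original text; the helper names `e`, `S`, and `max_defect` are chosen for this example. It evaluates both sides in floating-point arithmetic and reports the largest discrepancy over all $a$ and $b$.

```python
import cmath
from math import gcd

def e(t):
    """e(t) = exp(2*pi*i*t), the notation used throughout the chapter."""
    return cmath.exp(2j * cmath.pi * t)

def S(q, a, k):
    """Complete sum S(q, a) = sum_{r=1}^{q} e(a r^k / q)."""
    # Reduce a*r^k modulo q so the argument of e() stays in [0, 1).
    return sum(e((a * pow(r, k, q)) % q / q) for r in range(1, q + 1))

def max_defect(q, r, k):
    """Largest value of |S(qr, ar+bq) - S(q,a) S(r,b)| over 1<=a<=q, 1<=b<=r."""
    if gcd(q, r) != 1:
        raise ValueError("q and r must be relatively prime")
    worst = 0.0
    for a in range(1, q + 1):
        for b in range(1, r + 1):
            lhs = S(q * r, a * r + b * q, k)
            rhs = S(q, a, k) * S(r, b, k)
            worst = max(worst, abs(lhs - rhs))
    return worst

if __name__ == "__main__":
    print(max_defect(4, 9, 3))  # expected to be at rounding-error level
```

Any value far above rounding error would indicate a mistake in the implementation, since the identity itself is exact.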
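Lemma 5.5 If $(q,r) = 1$, then
\[ A_N(qr) = A_N(q)\,A_N(r), \]
that is, the function $A_N(q)$ is multiplicative.

Proof. If $c$ and $qr$ are relatively prime, then $c$ is congruent modulo $qr$ to a number of the form $ar + bq$, where $(a,q) = (b,r) = 1$. It follows from Lemma 5.4 that
\[
\begin{aligned}
A_N(qr) &= \sum_{\substack{c=1 \\ (c,qr)=1}}^{qr} \left(\frac{S(qr,c)}{qr}\right)^{s} e\left(-\frac{cN}{qr}\right) \\
&= \sum_{\substack{a=1 \\ (a,q)=1}}^{q} \sum_{\substack{b=1 \\ (b,r)=1}}^{r} \left(\frac{S(qr, ar+bq)}{qr}\right)^{s} e\left(-\frac{(ar+bq)N}{qr}\right) \\
&= \sum_{\substack{a=1 \\ (a,q)=1}}^{q} \sum_{\substack{b=1 \\ (b,r)=1}}^{r} \left(\frac{S(q,a)}{q}\right)^{s} \left(\frac{S(r,b)}{r}\right)^{s} e\left(-\frac{aN}{q}\right) e\left(-\frac{bN}{r}\right) \\
&= \sum_{\substack{a=1 \\ (a,q)=1}}^{q} \left(\frac{S(q,a)}{q}\right)^{s} e\left(-\frac{aN}{q}\right) \sum_{\substack{b=1 \\ (b,r)=1}}^{r} \left(\frac{S(r,b)}{r}\right)^{s} e\left(-\frac{bN}{r}\right) \\
&= A_N(q)\,A_N(r).
\end{aligned}
\]
This completes the proof.

For any positive integer $q$, we let $M_N(q)$ denote the number of solutions of the congruence
\[ x_1^k + \cdots + x_s^k \equiv N \pmod q \]
in integers $x_i$ such that $1 \le x_i \le q$ for $i = 1, \ldots, s$.

Lemma 5.6 Let $s \ge 2^k + 1$. For every prime $p$, the series
\[ \chi_N(p) = 1 + \sum_{h=1}^{\infty} A_N(p^h) \tag{5.11} \]
converges, and
\[ \chi_N(p) = \lim_{h \to \infty} \frac{M_N(p^h)}{p^{h(s-1)}}. \tag{5.12} \]

Proof. The convergence of the series (5.11) follows immediately from inequality (5.9). If $(a,q) = d$, then
\[
\begin{aligned}
S(q,a) &= \sum_{x=1}^{q} e\left(\frac{ax^k}{q}\right) = \sum_{x=1}^{q} e\left(\frac{(a/d)x^k}{q/d}\right) \\
&= d \sum_{x=1}^{q/d} e\left(\frac{(a/d)x^k}{q/d}\right) = d\,S(q/d, a/d).
\end{aligned}
\]
Since
\[
\frac{1}{q} \sum_{a=1}^{q} e\left(\frac{am}{q}\right) =
\begin{cases}
1 & \text{if } m \equiv 0 \pmod q, \\
0 & \text{if } m \not\equiv 0 \pmod q,
\end{cases}
\]
it follows that for any integers $x_1, \ldots, x_s$,
\[
\frac{1}{q} \sum_{a=1}^{q} e\left(\frac{a(x_1^k + \cdots + x_s^k - N)}{q}\right) =
\begin{cases}
1 & \text{if } x_1^k + \cdots + x_s^k \equiv N \pmod q, \\
0 & \text{if } x_1^k + \cdots + x_s^k \not\equiv N \pmod q,
\end{cases}
\]
and so
\[
\begin{aligned}
M_N(q) &= \sum_{x_1=1}^{q} \cdots \sum_{x_s=1}^{q} \frac{1}{q} \sum_{a=1}^{q} e\left(\frac{a(x_1^k + \cdots + x_s^k - N)}{q}\right) \\
&= \frac{1}{q} \sum_{a=1}^{q} \sum_{x_1=1}^{q} \cdots \sum_{x_s=1}^{q} e\left(\frac{a(x_1^k + \cdots + x_s^k - N)}{q}\right) \\
&= \frac{1}{q} \sum_{a=1}^{q} \sum_{x_1=1}^{q} e\left(\frac{ax_1^k}{q}\right) \cdots \sum_{x_s=1}^{q} e\left(\frac{ax_s^k}{q}\right) e\left(\frac{-aN}{q}\right) \\
&= \frac{1}{q} \sum_{a=1}^{q} S(q,a)^s e\left(\frac{-aN}{q}\right) \\
&= \frac{1}{q} \sum_{d \mid q} \sum_{\substack{a=1 \\ (a,q)=d}}^{q} S(q,a)^s e\left(\frac{-aN}{q}\right) \\
&= \frac{1}{q} \sum_{d \mid q} \sum_{\substack{a=1 \\ (a,q)=d}}^{q} d^s S(q/d, a/d)^s e\left(\frac{-(a/d)N}{q/d}\right) \\
&= q^{s-1} \sum_{d \mid q} \sum_{\substack{a=1 \\ (a,q)=d}}^{q} \left(\frac{S(q/d, a/d)}{q/d}\right)^{s} e\left(\frac{-(a/d)N}{q/d}\right) \\
&= q^{s-1} \sum_{d \mid q} A_N(q/d).
\end{aligned}
\]
Therefore,
\[ \sum_{d \mid q} A_N(q/d) = q^{1-s} M_N(q) \]
for all $q \ge 1$. In particular, for $q = p^h$ we have
\[ 1 + \sum_{j=1}^{h} A_N(p^j) = \sum_{d \mid p^h} A_N(p^h/d) = p^{h(1-s)} M_N(p^h) \]
and so
\[
\begin{aligned}
\chi_N(p) &= \lim_{h \to \infty} \left(1 + \sum_{j=1}^{h} A_N(p^j)\right) \\
&= \lim_{h \to \infty} p^{h(1-s)} M_N(p^h).
\end{aligned}
\]
This completes the proof.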
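The identity $\sum_{d \mid q} A_N(q/d) = q^{1-s} M_N(q)$ obtained in the proof can be verified by brute force for small parameters. The following Python sketch is an added illustration with hypothetical function names, not part of the original text. It computes $A_N(q)$ from the exponential sums and counts $M_N(q)$ by repeatedly convolving the distribution of $k$th powers modulo $q$, which avoids enumerating all $q^s$ tuples.

```python
import cmath
from math import gcd

def e(t):
    """e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * cmath.pi * t)

def S(q, a, k):
    """S(q, a) = sum_{r=1}^{q} e(a r^k / q)."""
    return sum(e((a * pow(r, k, q)) % q / q) for r in range(1, q + 1))

def A(q, N, k, s):
    """A_N(q): sum over 1 <= a <= q, (a,q)=1, of (S(q,a)/q)^s e(-aN/q)."""
    return sum((S(q, a, k) / q) ** s * e(-((a * N) % q) / q)
               for a in range(1, q + 1) if gcd(a, q) == 1)

def M(q, N, k, s):
    """Number of (x_1..x_s) in [1,q]^s with x_1^k + ... + x_s^k = N (mod q)."""
    powers = [0] * q                      # powers[c] = #{x : x^k = c mod q}
    for x in range(1, q + 1):
        powers[pow(x, k, q)] += 1
    dist = [1] + [0] * (q - 1)            # distribution of the empty sum
    for _ in range(s):                    # add one k-th power at a time
        new = [0] * q
        for u, cu in enumerate(dist):
            if cu:
                for v, cv in enumerate(powers):
                    if cv:
                        new[(u + v) % q] += cu * cv
        dist = new
    return dist[N % q]

def lemma_5_6_defect(q, N, k, s):
    """|sum_{d|q} A_N(q/d) - q^(1-s) M_N(q)|, which should be ~0."""
    lhs = sum(A(q // d, N, k, s) for d in range(1, q + 1) if q % d == 0)
    return abs(lhs - q ** (1 - s) * M(q, N, k, s))

if __name__ == "__main__":
    print(lemma_5_6_defect(8, 3, 2, 5))
```

The identity holds for every $s \ge 1$; the hypothesis $s \ge 2^k + 1$ in Lemma 5.6 is needed only for the convergence of (5.11).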


Lemma 5.7 If $s \ge 2^k + 1$, then
\[ \mathfrak{S}(N) = \prod_{p} \chi_N(p). \tag{5.13} \]
Moreover, there exists a constant $c_2$ depending only on $k$ and $s$ such that
\[ 0 < \mathfrak{S}(N) < c_2 \]
for all $N$, and there exists a prime $p_0$ depending only on $k$ and $s$ such that
\[ 1/2 \le \prod_{p > p_0} \chi_N(p) \le 3/2 \tag{5.14} \]
for all $N \ge 1$.

Proof. We proved that if $s \ge 2^k + 1$, then
\[ A_N(q) \ll \frac{1}{q^{1+\delta_4}}, \]
where $\delta_4$ depends only on $k$ and $s$, and so the series $\sum_q A_N(q)$ converges absolutely. Since the function $A_N(q)$ is multiplicative, Theorem A.28 immediately implies the convergence of the Euler product (5.13). In particular, $\chi_N(p) \ne 0$ for all $N$ and $p$. Since $\chi_N(p)$ is nonnegative by (5.12), it follows that $\chi_N(p)$ is a positive real number for all $N$ and $p$, and so the singular series $\mathfrak{S}(N)$ is positive. Again, by (5.9),
\[ 0 < \mathfrak{S}(N) \ll \sum_{q=1}^{\infty} \frac{1}{q^{1+\delta_4}} < \infty \]
and
\[ |\chi_N(p) - 1| \le \sum_{h=1}^{\infty} |A_N(p^h)| \ll \sum_{h=1}^{\infty} \frac{1}{p^{h(1+\delta_4)}} \ll \frac{1}{p^{1+\delta_4}}. \]
Therefore, there exists a constant $c$ depending only on $k$ and $s$ such that
\[ 1 - \frac{c}{p^{1+\delta_4}} \le \chi_N(p) \le 1 + \frac{c}{p^{1+\delta_4}} \]
for all $N$ and $p$. Inequality (5.14) follows from the convergence of the infinite products $\prod_p \left(1 \pm c\,p^{-1-\delta_4}\right)$. This completes the proof.

We want to show that $\mathfrak{S}(N)$ is bounded away from 0 uniformly for all $N$. By inequality (5.14), it suffices to show, for every prime $p$, that $\chi_N(p)$ is uniformly bounded away from 0.

Let $p$ be a prime, and let
\[ k = p^{\tau} k_0, \]
where $\tau \ge 0$ and $(p, k_0) = 1$. We define
\[
\gamma =
\begin{cases}
\tau + 1 & \text{if } p > 2, \\
\tau + 2 & \text{if } p = 2.
\end{cases}
\]

Lemma 5.8 Let $m$ be an integer not divisible by $p$. If the congruence $x^k \equiv m \pmod{p^{\gamma}}$ is solvable, then the congruence $y^k \equiv m \pmod{p^h}$ is solvable for every $h \ge \gamma$.


Proof. There are two cases. In the first case, $p$ is an odd prime. For $h \ge \gamma = \tau + 1$, we have
\[ (k, \varphi(p^h)) = (k_0 p^{\tau}, (p-1)p^{h-1}) = (k_0, p-1)\,p^{\tau} = (k, \varphi(p^{\gamma})). \]
The congruence classes modulo $p^h$ that are relatively prime to $p$ form a cyclic group of order $\varphi(p^h) = (p-1)p^{h-1}$. Let $g$ be a generator of this cyclic group, that is, a primitive root modulo $p^h$. Then $g$ is also a primitive root modulo $p^{\gamma}$. Let $x^k \equiv m \pmod{p^{\gamma}}$. Then $(x,p) = 1$, and we can choose integers $r$ and $u$ such that
\[ x \equiv g^u \pmod{p^h} \]
and
\[ m \equiv g^r \pmod{p^h}. \]
Then
\[ ku \equiv r \pmod{\varphi(p^{\gamma})}, \]
and so
\[ r \equiv 0 \pmod{(k, \varphi(p^{\gamma}))} \]
and
\[ r \equiv 0 \pmod{(k, \varphi(p^{h}))}. \]
Therefore, there exists an integer $v$ such that
\[ kv \equiv r \pmod{\varphi(p^h)}. \]
Let $y = g^v$. Then $y^k \equiv m \pmod{p^h}$.

In the second case, $p = 2$ and so $m$ and $x$ are odd. If $\tau = 0$, then $k$ is odd. As $y$ runs through the set of odd congruence classes modulo $2^h$, so does $y^k$, and the congruence $y^k \equiv m \pmod{2^h}$ is solvable for all $h \ge 1$. If $\tau \ge 1$, then $k$ is even and $m \equiv x^k \equiv 1 \pmod 4$. Also, $x^k = (-x)^k$, and so we can assume that $x \equiv 1 \pmod 4$. The congruence classes modulo $2^h$ that are congruent to 1 modulo 4 form a cyclic subgroup of order $2^{h-2}$, and 5 is a generator of this subgroup. Choose integers $r$ and $u$ such that
\[ m \equiv 5^r \pmod{2^h} \]
and
\[ x \equiv 5^u \pmod{2^h}. \]
Then $x^k \equiv m \pmod{2^{\gamma}}$ is equivalent to
\[ ku \equiv r \pmod{2^{\gamma-2}}, \]
and so $r$ is divisible by $(k, 2^{\tau}) = 2^{\tau} = (k, 2^{h-2})$. It follows that there exists an integer $v$ such that
\[ kv \equiv r \pmod{2^{h-2}}. \]
Let $y = 5^v$. Then $y^k \equiv m \pmod{2^h}$. This completes the proof.
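Lemma 5.8 says that, for $m$ prime to $p$, being a $k$th power modulo $p^{\gamma}$ already decides whether $m$ is a $k$th power modulo every higher power $p^h$. A brute-force check for small $p$, $k$, and $h$ might look as follows; this sketch is an added illustration, and the function names are hypothetical.

```python
def gamma_exponent(p, k):
    """gamma = tau + 1 (p odd) or tau + 2 (p = 2), where p^tau exactly divides k."""
    tau = 0
    while k % p == 0:
        k //= p
        tau += 1
    return tau + 1 if p > 2 else tau + 2

def power_residues(k, p, h):
    """Set of k-th power residues modulo p^h among classes prime to p."""
    mod = p ** h
    return {pow(x, k, mod) for x in range(1, mod) if x % p}

def lifts_ok(p, k, extra=2):
    """Check Lemma 5.8 for h = gamma, ..., gamma + extra by exhaustive search."""
    g = gamma_exponent(p, k)
    base = power_residues(k, p, g)
    for h in range(g, g + extra + 1):
        residues = power_residues(k, p, h)
        for m in range(1, p ** h):
            if m % p == 0:
                continue
            # m is a k-th power mod p^h exactly when its reduction
            # mod p^gamma is a k-th power mod p^gamma.
            if ((m % p ** g) in base) != (m in residues):
                return False
    return True

if __name__ == "__main__":
    print(all(lifts_ok(p, k) for p in (2, 3, 5) for k in (2, 3, 4, 6)))
```

The exponent $\gamma$ cannot be lowered in general: for $p = 2$ and $k = 2$, the integer 5 is a square modulo 4 but not modulo 8.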


Lemma 5.9 Let $p$ be prime. If there exist integers $a_1, \ldots, a_s$, not all divisible by $p$, such that
\[ a_1^k + \cdots + a_s^k \equiv N \pmod{p^{\gamma}}, \]
then
\[ \chi_N(p) \ge p^{\gamma(1-s)} > 0. \]

Proof. Suppose that $a_1 \not\equiv 0 \pmod p$. Let $h > \gamma$. For each $i = 2, \ldots, s$ there exist $p^{h-\gamma}$ integers $x_i$, pairwise incongruent modulo $p^h$, such that
\[ x_i \equiv a_i \pmod{p^{\gamma}}. \]
Since the congruence
\[ x_1^k \equiv N - x_2^k - \cdots - x_s^k \pmod{p^{\gamma}} \]
is solvable with $x_1 = a_1 \not\equiv 0 \pmod p$, it follows from Lemma 5.8 that the congruence
\[ x_1^k \equiv N - x_2^k - \cdots - x_s^k \pmod{p^h} \]
is solvable. This implies that
\[ M_N(p^h) \ge p^{(h-\gamma)(s-1)}, \]
and so
\[ \chi_N(p) = \lim_{h \to \infty} \frac{M_N(p^h)}{p^{h(s-1)}} \ge \frac{1}{p^{\gamma(s-1)}} > 0. \]
This completes the proof.

Lemma 5.10 If $s \ge 2k$ for $k$ odd or $s \ge 4k$ for $k$ even, then
\[ \chi_N(p) \ge p^{\gamma(1-s)} > 0. \]

Proof. By Lemma 5.9, it suffices to prove that the congruence
\[ a_1^k + \cdots + a_s^k \equiv N \pmod{p^{\gamma}} \tag{5.15} \]
is solvable in integers $a_i$ not all divisible by $p$. If $N$ is not divisible by $p$ and the congruence is solvable, then at least one of the integers $a_i$ is prime to $p$. If $N$ is divisible by $p$, then it suffices to show that the congruence
\[ a_1^k + \cdots + a_{s-1}^k + 1^k \equiv N \pmod{p^{\gamma}} \]
has a solution in integers. This is equivalent to solving the congruence
\[ a_1^k + \cdots + a_{s-1}^k \equiv N - 1 \pmod{p^{\gamma}}. \]
In this case, $(N-1, p) = 1$. Therefore, it suffices to prove that, for $(N,p) = 1$, the congruence (5.15) is solvable in integers for $s \ge 2k - 1$ if $p$ is odd and for $s \ge 4k - 1$ if $p$ is even.

Let $p$ be an odd prime and $g$ be a primitive root modulo $p^{\gamma}$. The order of $g$ is $\varphi(p^{\gamma}) = (p-1)p^{\gamma-1} = (p-1)p^{\tau}$. Let $(m,p) = 1$. The integer $m$ is a $k$th power residue modulo $p^{\gamma}$ if and only if there exists an integer $x$ such that
\[ x^k \equiv m \pmod{p^{\gamma}}. \]
Let $m \equiv g^r \pmod{p^{\gamma}}$. Then $m$ is a $k$th power residue if and only if there exists an integer $v$ such that $x \equiv g^v \pmod{p^{\gamma}}$ and
\[ kv \equiv r \pmod{(p-1)p^{\tau}}. \]
Since $k = k_0 p^{\tau}$ with $(k_0, p) = 1$, it follows that this congruence is solvable if and only if
\[ r \equiv 0 \pmod{(k_0, p-1)\,p^{\tau}}, \]
and so there are
\[ \frac{\varphi(p^{\gamma})}{(k_0, p-1)\,p^{\tau}} = \frac{p-1}{(k_0, p-1)} \]
distinct $k$th power residues modulo $p^{\gamma}$. Let $s(N)$ denote the smallest integer $s$ for which the congruence (5.15) is solvable, and let $C(j)$ denote the set of all congruence classes $N$ modulo $p^{\gamma}$ such that $(N,p) = 1$ and $s(N) = j$. In particular, $C(1)$ consists precisely of the $k$th power residues modulo $p^{\gamma}$. If $(m,p) = 1$ and $N' = m^k N$, then $s(N') = s(N)$. It follows that the sets $C(j)$ are closed under multiplication by $k$th power residues, and so, if $C(j)$ is nonempty, then $|C(j)| \ge (p-1)/(k_0, p-1)$. Let $n$ be the largest integer such that the set $C(n)$ is nonempty. Let $j < n$ and let $N$ be the smallest integer such that $(N,p) = 1$ and $s(N) > j$. Since $p$ is an odd prime, it follows that $N - i$ is prime to $p$ for $i = 1$ or 2, and $s(N-i) \le j$. Since $N = (N-1) + 1^k$ and $N = (N-2) + 1^k + 1^k$, it follows that
\[ j + 1 \le s(N) \le s(N-i) + 2 \le j + 2 \]
and so $s(N-i) = j$ or $j - 1$. This implies that no two consecutive sets $C(j)$ are empty for $j = 1, \ldots, n$, and so the number of nonempty sets $C(j)$ is at least $(n+1)/2$. Since the sets $C(j)$ are pairwise disjoint, it follows that
\[ (p-1)p^{\tau} = \varphi(p^{\gamma}) = \sum_{\substack{j=1 \\ C(j) \ne \emptyset}}^{n} |C(j)| \ge \frac{n+1}{2} \cdot \frac{p-1}{(k_0, p-1)} \]
and so
\[ n \le 2(k_0, p-1)\,p^{\tau} - 1 \le 2k - 1. \]
Therefore, $s(N) \le 2k - 1$ if $p$ is an odd prime and $N$ is prime to $p$.

Let $p = 2$. If $k$ is odd, then every odd integer is a $k$th power residue modulo $2^{\gamma}$, so $s(N) = 1$ for all odd integers $N$. If $k$ is even, then $k = 2^{\tau} k_0$ with $\tau \ge 1$, and $\gamma = \tau + 2$. We can assume that $1 \le N \le 2^{\gamma} - 1$. If
\[ s = 2^{\gamma} - 1 = 4 \cdot 2^{\tau} - 1 \le 4k - 1, \]
then congruence (5.15) can always be solved by choosing $a_i = 1$ for $i = 1, \ldots, N$ and $a_i = 0$ for $i = N+1, \ldots, s$. Therefore, $s(N) \le 4k - 1$ for all odd $N$. This completes the proof.
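The bounds $s(N) \le 2k-1$ (odd $p$) and $s(N) \le 4k-1$ ($p = 2$) from the proof of Lemma 5.10 can be confirmed by direct computation for small cases. The Python sketch below is an added illustration, not part of the original text; `max_summands_needed` is a hypothetical name. It builds, step by step, the set of residues modulo $p^{\gamma}$ that are sums of $s$ $k$th powers and records the first $s$ at which each class prime to $p$ appears.

```python
def gamma_exponent(p, k):
    """gamma = tau + 1 (p odd) or tau + 2 (p = 2), where p^tau exactly divides k."""
    tau = 0
    while k % p == 0:
        k //= p
        tau += 1
    return tau + 1 if p > 2 else tau + 2

def max_summands_needed(p, k):
    """Max over N prime to p of s(N), the least s with
    a_1^k + ... + a_s^k = N (mod p^gamma) solvable."""
    mod = p ** gamma_exponent(p, k)
    powers = {pow(a, k, mod) for a in range(mod)}   # includes 0 = 0^k
    targets = [N for N in range(mod) if N % p]
    first_seen = {}
    reachable = {0}                                  # sums of zero k-th powers
    s = 0
    while len(first_seen) < len(targets):
        s += 1
        reachable = {(u + v) % mod for u in reachable for v in powers}
        for N in targets:
            if N in reachable and N not in first_seen:
                first_seen[N] = s
    return max(first_seen.values())

if __name__ == "__main__":
    for p in (2, 3, 5, 7):
        for k in (2, 3, 4):
            bound = 4 * k - 1 if p == 2 else 2 * k - 1
            print(p, k, max_summands_needed(p, k), bound)
```

For $p = 2$ and $k = 4$ the only fourth powers modulo 16 are 0 and 1, so the class 15 needs all $4k - 1 = 15$ summands; in that case the bound from the proof is attained.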


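Theorem 5.6 There exist positive constants $c_1 = c_1(k,s)$ and $c_2 = c_2(k,s)$ such that
\[ c_1 < \mathfrak{S}(N) < c_2. \]
Moreover, for all sufficiently large integers $N$,
\[ \mathfrak{S}(N, P^{\nu}) = \mathfrak{S}(N) + O\left(P^{-\nu\delta_4}\right). \]

Proof. The only part of the theorem that we have not yet proved is the lower bound for $\mathfrak{S}(N)$. However, we showed that there exists a prime $p_0 = p_0(k,s)$ such that
\[ 1/2 \le \prod_{p > p_0} \chi_N(p) \le 3/2 \]
for all $N \ge 1$. Since
\[ \chi_N(p) \ge p^{\gamma(1-s)} > 0 \]
for all primes $p$ and all $N$, it follows that
\[ \mathfrak{S}(N) = \prod_{p} \chi_N(p) \ge \frac{1}{2} \prod_{p \le p_0} \chi_N(p) \ge \frac{1}{2} \prod_{p \le p_0} p^{\gamma(1-s)} = c_1 > 0. \]
This completes the proof.


5.8 Conclusion


We are now ready to prove the Hardy-Littlewood asymptotic formula.


Theorem 5.7 (Hardy-Littlewood) Let $k \ge 2$ and $s \ge 2^k + 1$. Let $r_{k,s}(N)$ denote the number of representations of $N$ as the sum of $s$ $k$th powers of positive integers. There exists $\delta = \delta(k,s) > 0$ such that
\[ r_{k,s}(N) = \mathfrak{S}(N)\, \Gamma\left(1 + \frac{1}{k}\right)^{s} \Gamma\left(\frac{s}{k}\right)^{-1} N^{(s/k)-1} + O\left(N^{(s/k)-1-\delta}\right), \]
where the implied constant depends only on $k$ and $s$, and $\mathfrak{S}(N)$ is an arithmetic function such that
\[ c_1 < \mathfrak{S}(N) < c_2 \]
for all $N$, where $c_1$ and $c_2$ are positive constants that depend only on $k$ and $s$.

Proof. Let $\delta_0 = \min(1, \delta_1, \delta_2, \delta_3, \nu\delta_4)$. By Theorems 5.2–5.6, we have
\[
\begin{aligned}
r_{k,s}(N) &= \int_0^1 F(\alpha)^s e(-\alpha N)\,d\alpha \\
&= \int_{\mathfrak{M}} F(\alpha)^s e(-\alpha N)\,d\alpha + \int_{\mathfrak{m}} F(\alpha)^s e(-\alpha N)\,d\alpha \\
&= \mathfrak{S}(N, P^{\nu})\, J^*(N) + O\left(P^{s-k-\delta_2}\right) + O\left(P^{s-k-\delta_1}\right) \\
&= \left(\mathfrak{S}(N) + O\left(P^{-\nu\delta_4}\right)\right)\left(J(N) + O\left(P^{s-k-\delta_3}\right)\right) + O\left(P^{s-k-\delta_2}\right) + O\left(P^{s-k-\delta_1}\right) \\
&= \mathfrak{S}(N)\, J(N) + O\left(P^{s-k-\delta_0}\right) \\
&= \mathfrak{S}(N)\, \Gamma\left(1 + \frac{1}{k}\right)^{s} \Gamma\left(\frac{s}{k}\right)^{-1} N^{s/k-1} + O\left(N^{(s-1)/k-1}\right) + O\left(N^{s/k-1-\delta_0/k}\right) \\
&= \mathfrak{S}(N)\, \Gamma\left(1 + \frac{1}{k}\right)^{s} \Gamma\left(\frac{s}{k}\right)^{-1} N^{s/k-1} + O\left(N^{s/k-1-\delta}\right),
\end{aligned}
\]
where $\delta = \delta_0/k$. This completes the proof.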


5.9 Notes 


The circle method was invented by Hardy and Ramanujan [50] to obtain the asymptotic formula for the partition function $p(N)$, which counts the number of unordered representations of a positive integer $N$ as the sum of any number of positive integers. The circle method was also applied to study the number of representations of an integer as a sum of squares. See, for example, Hardy [45], and the particularly important work of Kloosterman [71, 72, 73].

In a classic series of papers, “Some problems of ‘Partitio Numerorum’,” Hardy and Littlewood [47, 48] applied the circle method to Waring’s problem. Vinogradov [131, 134, 135] subsequently simplified and strengthened their method. This chapter gives the classical proof of the Hardy-Littlewood formula for $s \ge s_0(k) = 2^k + 1$. There is a vast literature on applications of the circle method to Waring’s problem as well as to other problems in additive number theory. The books of Davenport [18], Hua [64], Vaughan [125], and Vinogradov [135] are excellent references.

There have been great technological improvements in the circle method in recent years, particularly by the Anglo-Michigan school (for example, Vaughan and Wooley [126, 127, 128, 129, 130, 147, 148]). In particular, Wooley [146] proved that
\[ G(k) \le k(\log k + \log\log k + O(1)). \]
Another interesting recent result concerns the range of validity of the Hardy-Littlewood asymptotic formula. Let $\tilde{G}(k)$ denote the smallest integer $s_0$ such that the Hardy-Littlewood asymptotic formula (5.1) holds for all $s \ge s_0$. Ford [41] proved that
\[ \tilde{G}(k) \le k^2(\log k + \log\log k + O(1)). \]
For other recent developments in the circle method, see Heath-Brown [54, 55], Hooley [59, 60, 61], and Schmidt [107].


5.10 Exercises 


1. Show that for $k = 1$ the Hardy-Littlewood asymptotic formula is consistent with Theorem 5.1.

2. Let $k \ge 2$. Show that the number of positive integers not exceeding $x$ that can be written as the sum of $k$ nonnegative $k$th powers is $x/k! + O\left(x^{(k-1)/k}\right)$. Show that
\[ G(k) \ge k + 1. \]
Hint: If $n \le x$ is a sum of $k$ $k$th powers, then
\[ n = a_1^k + a_2^k + \cdots + a_k^k, \]
where
\[ 0 \le a_1 \le a_2 \le \cdots \le a_k \le x^{1/k}, \]
and the number of such expressions is given by a binomial coefficient.

3. Let $f(x)$ be a polynomial of degree $k \ge 2$ with integral coefficients, and let
\[ S_f(q,a) = \sum_{r=1}^{q} e(af(r)/q). \]
Prove that if $(q,r) = 1$, then
\[ S_f(qr, ar + bq) = S_f(q,a)\,S_f(r,b). \]

4. Let $R_{k,s}(N)$ denote the number of representations of an integer $N$ as the sum of $s$ nonnegative $k$th powers. State and prove an asymptotic formula for $R_{k,s}(N)$.


Part II


The Goldbach conjecture


6


Elementary estimates for primes


Brun’s method is perhaps our most powerful elementary tool in number theory.


P. Erdős [34]


6.1 Euclid’s theorem 


Before beginning to study sums of primes, we need some elementary results about 
the distribution of prime numbers. 

Let $s = \sigma + it$ be a complex number with real part $\sigma$ and imaginary part $t$. To every sequence of complex numbers $a_1, a_2, \ldots$ is associated the Dirichlet series
\[ F(s) = \sum_{n=1}^{\infty} \frac{a_n}{n^s}. \]
If the series $F(s)$ converges absolutely for some complex number $s_0 = \sigma_0 + it_0$, then $F(s)$ converges absolutely for all complex numbers $s = \sigma + it$ with $\Re(s) = \sigma > \sigma_0 = \Re(s_0)$, since
\[ \left|\frac{a_n}{n^s}\right| = \frac{|a_n|}{n^{\sigma}} \le \frac{|a_n|}{n^{\sigma_0}} = \left|\frac{a_n}{n^{s_0}}\right|. \]
If we let $a_n = 1$ for all $n \ge 1$, we obtain the Riemann zeta-function
\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}. \]
This Dirichlet series converges absolutely for all $s$ with $\Re(s) > 1$.

Theorem 6.1 Let $f(n)$ be a multiplicative function. If the Dirichlet series
\[ F(s) = \sum_{n=1}^{\infty} \frac{f(n)}{n^s} \]
converges absolutely for all complex numbers $s$ with $\Re(s) > \sigma_0$, then $F(s)$ can be represented as the infinite product
\[ F(s) = \prod_{p} \left(1 + \frac{f(p)}{p^s} + \frac{f(p^2)}{p^{2s}} + \cdots\right). \]
If $f(n)$ is completely multiplicative, then
\[ F(s) = \prod_{p} \left(1 - \frac{f(p)}{p^s}\right)^{-1}. \]
This is called the Euler product for $F(s)$.

Proof. If $f(n)$ is multiplicative, then so is $f(n)/n^s$. If $f(n)$ is completely multiplicative, then so is $f(n)/n^s$. The result follows immediately from Theorem A.28.

Because the Riemann zeta-function converges absolutely for $\Re(s) > 1$, it follows from Theorem 6.1 that $\zeta(s)$ has the Euler product
\[ \zeta(s) = \prod_{p} \left(1 - \frac{1}{p^s}\right)^{-1} \]
for all $s$ with $\Re(s) > 1$, and so $\zeta(s) \ne 0$ for $\Re(s) > 1$. From the Euler product, we obtain the following analytic proof that there are infinitely many primes.


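Theorem 6.2 (Euclid) There are infinitely many primes.

Proof. For $0 < x < 1$ we have the Taylor series
\[ -\log(1-x) = \sum_{n=1}^{\infty} \frac{x^n}{n}. \]
If $\sigma > 0$, then $\zeta(1+\sigma) > 1$ and
\[
\begin{aligned}
\log \zeta(1+\sigma) &= \log \prod_{p} \left(1 - \frac{1}{p^{1+\sigma}}\right)^{-1} \\
&= -\sum_{p} \log\left(1 - \frac{1}{p^{1+\sigma}}\right) \\
&= \sum_{p} \sum_{n=1}^{\infty} \frac{1}{n p^{n(1+\sigma)}} \\
&= \sum_{p} \frac{1}{p^{1+\sigma}} + \sum_{p} \sum_{n=2}^{\infty} \frac{1}{n p^{n(1+\sigma)}}.
\end{aligned}
\]
Since
\[ 0 < \sum_{p} \sum_{n=2}^{\infty} \frac{1}{n p^{n(1+\sigma)}} < \sum_{p} \sum_{n=2}^{\infty} \frac{1}{p^n} = \sum_{p} \frac{1}{p(p-1)} < \infty, \tag{6.1} \]
it follows that
\[ \log \zeta(1+\sigma) = \sum_{p} \frac{1}{p^{1+\sigma}} + O(1). \tag{6.2} \]
Let $0 < \sigma < 1$. Then
\[ 1 < \frac{1}{\sigma} = \int_1^{\infty} \frac{dx}{x^{1+\sigma}} < \zeta(1+\sigma) < 1 + \int_1^{\infty} \frac{dx}{x^{1+\sigma}} = \frac{1}{\sigma} + 1 \]
and so
\[
\begin{aligned}
0 < \log\frac{1}{\sigma} &< \log \zeta(1+\sigma) \\
&< \log\left(\frac{1}{\sigma} + 1\right) = \log\frac{1}{\sigma} + \log(1+\sigma) \\
&< \log\frac{1}{\sigma} + \sigma < \log\frac{1}{\sigma} + 1.
\end{aligned}
\]
Therefore,
\[ \log \zeta(1+\sigma) = \log\frac{1}{\sigma} + O(1). \tag{6.3} \]
Combining (6.2) and (6.3), we obtain
\[ \log\frac{1}{\sigma} = \sum_{p} \frac{1}{p^{1+\sigma}} + O(1) \]
for $0 < \sigma < 1$. If there were only finitely many prime numbers, then the sum on the right side of this equation would remain bounded as $\sigma$ tends to 0, but the logarithm on the left side of the equation goes to infinity as $\sigma$ tends to 0. This is impossible, so there must be infinitely many primes.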


6.2 Chebyshev’s theorem 


The simplest prime-counting functions are
\[ \pi(x) = \sum_{p \le x} 1, \]
\[ \vartheta(x) = \sum_{p \le x} \log p, \]
and
\[ \psi(x) = \sum_{p^k \le x} \log p. \]
The functions $\vartheta(x)$ and $\psi(x)$ are called the Chebyshev functions. Chebyshev proved that the functions $\vartheta(x)$ and $\psi(x)$ have order of magnitude $x$ and that $\pi(x)$ has order of magnitude $x/\log x$. Before proving this theorem, we need the following lemma about the unimodality of the sequence of binomial coefficients.

Lemma 6.1 Let $n \ge 1$ and $1 \le k \le n$. Then
\[ \binom{n}{k-1} < \binom{n}{k} \quad\text{if and only if } k < \frac{n+1}{2}, \]
\[ \binom{n}{k-1} > \binom{n}{k} \quad\text{if and only if } k > \frac{n+1}{2}, \]
\[ \binom{n}{k-1} = \binom{n}{k} \quad\text{if and only if $n$ is odd and } k = \frac{n+1}{2}. \]

Proof. This follows immediately from observing the ratio
\[ \frac{\binom{n}{k}}{\binom{n}{k-1}} = \frac{n!}{k!\,(n-k)!} \cdot \frac{(k-1)!\,(n-k+1)!}{n!} = \frac{n-k+1}{k}. \]


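Lemma 6.2 Let $n \ge 1$ and $N = \binom{2n}{n}$. Then
\[ N < 2^{2n} \le 2nN. \]

Proof. Since $\binom{2n}{n}$ is the middle, and hence the largest, binomial coefficient in the expansion of $(1+1)^{2n}$, it follows that
\[
\begin{aligned}
N = \binom{2n}{n} &< (1+1)^{2n} = 2^{2n} \\
&= \sum_{k=0}^{2n} \binom{2n}{k} = 1 + \sum_{k=1}^{2n-1} \binom{2n}{k} + 1 \\
&\le 2 + (2n-1)\binom{2n}{n} \le 2n\binom{2n}{n} \\
&= 2nN.
\end{aligned}
\]
This completes the proof.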
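Both lemmas concern exact integer inequalities, so they can be confirmed with integer arithmetic. The short Python sketch below is an added illustration with hypothetical function names; it relies only on `math.comb` from the standard library.

```python
from math import comb

def lemma_6_1_holds(n):
    """Check the three equivalences of Lemma 6.1 for every 1 <= k <= n."""
    for k in range(1, n + 1):
        left, right = comb(n, k - 1), comb(n, k)
        # Compare k with (n+1)/2 in integers: k < (n+1)/2  <=>  2k < n+1.
        if (left < right) != (2 * k < n + 1):
            return False
        if (left > right) != (2 * k > n + 1):
            return False
        if (left == right) != (2 * k == n + 1):
            return False
    return True

def lemma_6_2_holds(n):
    """Check N < 2^(2n) <= 2nN for N = C(2n, n)."""
    N = comb(2 * n, n)
    return N < 2 ** (2 * n) <= 2 * n * N

if __name__ == "__main__":
    print(all(lemma_6_1_holds(n) and lemma_6_2_holds(n) for n in range(1, 100)))
```

Equality in the second inequality of Lemma 6.2 occurs for $n = 1$, where $N = 2$ and $2^{2n} = 2nN = 4$.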
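For any positive integer $n$, let $v_p(n)$ denote the highest power of $p$ that divides $n$. Thus, $v_p(n) = k$ if and only if $p^k \,\|\, n$. In this case, $p^k \le n$ and so $v_p(n) \le \log n/\log p$.

Lemma 6.3 For every positive integer $n$,
\[ v_p(n!) = \sum_{k=1}^{\infty} \left[\frac{n}{p^k}\right] = \sum_{k=1}^{[\log n/\log p]} \left[\frac{n}{p^k}\right]. \tag{6.4} \]

Proof. Since $v_p(mn) = v_p(m) + v_p(n)$ for all positive integers $m$ and $n$, we have
\[ v_p(n!) = \sum_{m=1}^{n} v_p(m) = \sum_{m=1}^{n} \sum_{\substack{k \ge 1 \\ p^k \mid m}} 1 = \sum_{k=1}^{\infty} \sum_{\substack{m=1 \\ p^k \mid m}}^{n} 1 = \sum_{k=1}^{\infty} \left[\frac{n}{p^k}\right]. \]
This proves the formula.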
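Formula (6.4) translates directly into a loop over the powers of $p$. The following Python sketch is an added illustration with hypothetical function names; it compares the formula with the exponent obtained by factoring $n!$ directly.

```python
from math import factorial

def vp(n, p):
    """Exponent of the prime p in the positive integer n."""
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k

def vp_factorial(n, p):
    """v_p(n!) computed from (6.4) as the sum of [n / p^k] over k >= 1."""
    total, pk = 0, p
    while pk <= n:          # terms with p^k > n contribute 0
        total += n // pk
        pk *= p
    return total

if __name__ == "__main__":
    print(vp_factorial(100, 5), vp(factorial(100), 5))
```

The loop stops as soon as $p^k > n$, which is the truncation at $k = [\log n/\log p]$ in (6.4), carried out without floating-point logarithms.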


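Theorem 6.3 (Chebyshev) There exist positive constants $c_1$ and $c_2$ such that
\[ c_1 x \le \vartheta(x) \le \psi(x) \le \pi(x)\log x \le c_2 x \tag{6.5} \]
for all $x \ge 2$. Moreover,
\[ \liminf_{x\to\infty} \frac{\vartheta(x)}{x} = \liminf_{x\to\infty} \frac{\psi(x)}{x} = \liminf_{x\to\infty} \frac{\pi(x)\log x}{x} \ge \log 2 \]
and
\[ \limsup_{x\to\infty} \frac{\vartheta(x)}{x} = \limsup_{x\to\infty} \frac{\psi(x)}{x} = \limsup_{x\to\infty} \frac{\pi(x)\log x}{x} \le 4\log 2. \]

Proof. Let $x \ge 2$. If $p^k \le x$, then $k \le [\log x/\log p]$, and so
\[
\begin{aligned}
\vartheta(x) = \sum_{p \le x} \log p \le \psi(x) &= \sum_{p^k \le x} \log p = \sum_{p \le x} \left[\frac{\log x}{\log p}\right] \log p \\
&\le \sum_{p \le x} \log x = \pi(x)\log x.
\end{aligned}
\]
Therefore,
\[ \liminf_{x\to\infty} \frac{\vartheta(x)}{x} \le \liminf_{x\to\infty} \frac{\psi(x)}{x} \le \liminf_{x\to\infty} \frac{\pi(x)\log x}{x} \]
and
\[ \limsup_{x\to\infty} \frac{\vartheta(x)}{x} \le \limsup_{x\to\infty} \frac{\psi(x)}{x} \le \limsup_{x\to\infty} \frac{\pi(x)\log x}{x}. \]
Let $0 < \delta < 1$. Then
\[
\begin{aligned}
\vartheta(x) &\ge \sum_{x^{1-\delta} < p \le x} \log p \\
&\ge \sum_{x^{1-\delta} < p \le x} (1-\delta)\log x \\
&= (1-\delta)\left(\pi(x) - \pi(x^{1-\delta})\right)\log x \\
&\ge (1-\delta)\pi(x)\log x - x^{1-\delta}\log x,
\end{aligned}
\]
and so
\[ \frac{\vartheta(x)}{x} \ge \frac{(1-\delta)\pi(x)\log x}{x} - \frac{\log x}{x^{\delta}}. \]
It follows that
\[ \liminf_{x\to\infty} \frac{\vartheta(x)}{x} \ge (1-\delta) \liminf_{x\to\infty} \frac{\pi(x)\log x}{x}. \]
This holds for all $\delta > 0$, and so
\[ \liminf_{x\to\infty} \frac{\vartheta(x)}{x} \ge \liminf_{x\to\infty} \frac{\pi(x)\log x}{x}. \]
Similarly,
\[ \limsup_{x\to\infty} \frac{\vartheta(x)}{x} \ge \limsup_{x\to\infty} \frac{\pi(x)\log x}{x}. \]
Therefore,
\[ \liminf_{x\to\infty} \frac{\vartheta(x)}{x} = \liminf_{x\to\infty} \frac{\psi(x)}{x} = \liminf_{x\to\infty} \frac{\pi(x)\log x}{x} \tag{6.6} \]
and
\[ \limsup_{x\to\infty} \frac{\vartheta(x)}{x} = \limsup_{x\to\infty} \frac{\psi(x)}{x} = \limsup_{x\to\infty} \frac{\pi(x)\log x}{x}. \tag{6.7} \]
Let $n \ge 1$, and let
\[ N = \binom{2n}{n} = \frac{2n(2n-1)(2n-2)\cdots(n+1)}{n!}. \]
Then $N$ is an integer, since it is a binomial coefficient, and
\[ \frac{2^{2n}}{2n} \le N < 2^{2n} \]
by Lemma 6.2. If $p$ is a prime number such that
\[ n < p \le 2n, \]
then $p$ divides the numerator but not the denominator of $N$. Therefore, $N$ is divisible by the product of all these primes, and so
\[ \prod_{n < p \le 2n} p \le N < 2^{2n}. \]
In particular, if $r \ge 1$ and $n = 2^{r-1}$, then
\[ \prod_{2^{r-1} < p \le 2^r} p \le N < 2^{2^r}. \]
It follows that, for any $R \ge 1$,
\[ \prod_{p \le 2^R} p = \prod_{r=1}^{R}\ \prod_{2^{r-1} < p \le 2^r} p < \prod_{r=1}^{R} 2^{2^r} < 2^{2^{R+1}}. \]
For any number $x \ge 2$, there is an integer $R \ge 1$ such that
\[ 2^{R-1} < x \le 2^R. \]
Then
\[ \prod_{p \le x} p \le \prod_{p \le 2^R} p < 2^{2^{R+1}} < 2^{4x}, \]
and so
\[ \vartheta(x) = \sum_{p \le x} \log p = \log\left(\prod_{p \le x} p\right) < (4\log 2)\,x. \]
Thus,
\[ \limsup_{x\to\infty} \frac{\vartheta(x)}{x} \le 4\log 2. \]
To obtain the lower limit, we use Lemma 6.3 to express $N$ explicitly as a product of prime powers:
\[ N = \binom{2n}{n} = \frac{(2n)!}{(n!)^2} = \prod_{p \le 2n} p^{\,v_p((2n)!) - 2v_p(n!)}, \]
where
\[ v_p((2n)!) - 2v_p(n!) = \sum_{k=1}^{[\log 2n/\log p]} \left(\left[\frac{2n}{p^k}\right] - 2\left[\frac{n}{p^k}\right]\right). \]
Since $[2t] - 2[t] = 0$ or 1 for all real numbers $t$, it follows that
\[ v_p((2n)!) - 2v_p(n!) \le \frac{\log 2n}{\log p}. \]
By Lemma 6.2,
\[ \frac{2^{2n}}{2n} \le N = \prod_{p \le 2n} p^{\,v_p((2n)!) - 2v_p(n!)} \le \prod_{p \le 2n} p^{\log 2n/\log p} = \prod_{p \le 2n} 2n = (2n)^{\pi(2n)} \]
or, equivalently,
\[ \pi(2n)\log 2n \ge 2n\log 2 - \log 2n. \]
Let $n = [x/2]$. Then
\[ 2n \le x < 2n + 2 \]
and
\[
\begin{aligned}
\pi(x)\log x &\ge \pi(2n)\log 2n \ge 2n\log 2 - \log 2n \\
&> (x-2)\log 2 - \log x = x\log 2 - \log x - 2\log 2.
\end{aligned}
\]
It follows that
\[ \frac{\pi(x)\log x}{x} \ge \log 2 - \frac{\log x + 2\log 2}{x} \]
and so
\[ \liminf_{x\to\infty} \frac{\pi(x)\log x}{x} \ge \log 2. \]
Since $\vartheta(2) > 0$, we have $\vartheta(x) \ge c_1 x$ for some $c_1 > 0$ and all $x \ge 2$. This completes the proof.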
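The explicit inequalities that appear in this proof can be observed numerically. The following Python sketch is an added illustration, not part of the original text; the sieve and the function names are choices made for this example. It computes $\pi(x)$, $\vartheta(x)$, and $\psi(x)$ for an integer $x \ge 2$, so that the chain $\vartheta(x) \le \psi(x) \le \pi(x)\log x$, the upper bound $\vartheta(x) < (4\log 2)x$, and the lower bound $\pi(x)\log x > x\log 2 - \log x - 2\log 2$ can be compared with actual values.

```python
from math import log

def primes_up_to(n):
    """Sieve of Eratosthenes: list of primes p <= n (n >= 2)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]

def chebyshev(x):
    """Return (pi(x), theta(x), psi(x)) for an integer x >= 2."""
    primes = primes_up_to(x)
    theta = sum(log(p) for p in primes)
    psi = 0.0
    for p in primes:
        pk = p
        while pk <= x:      # one log p for every prime power p^k <= x
            psi += log(p)
            pk *= p
    return len(primes), theta, psi

if __name__ == "__main__":
    for x in (10, 100, 1000, 10000):
        pi_x, theta, psi = chebyshev(x)
        print(x, theta / x, psi / x, pi_x * log(x) / x)
```

For moderate $x$ the three ratios printed are already close to 1, well inside the interval $[\log 2, 4\log 2]$ established by the theorem.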


Theorem 6.4 Let $p_n$ denote the $n$th prime number. There exist positive constants $c_3$ and $c_4$ such that
\[ c_3\, n\log n \le p_n \le c_4\, n\log n \]
for all $n \ge 2$.

Proof. By Chebyshev’s inequality (6.5),
\[ \frac{c_1 p_n}{\log p_n} \le \pi(p_n) = n \le \frac{c_2 p_n}{\log p_n} \]
and so
\[ c_2^{-1}\, n\log p_n \le p_n \le c_1^{-1}\, n\log p_n. \]
Since
\[ \log n \le \log p_n, \]
we have
\[ p_n \ge c_2^{-1}\, n\log n = c_3\, n\log n. \]
For $n$ sufficiently large,
\[
\begin{aligned}
\log p_n &\le \log n + \log\log p_n + \log c_1^{-1} \\
&\le \log n + 2\log\log p_n \\
&\le \log n + (1/2)\log p_n,
\end{aligned}
\]
so
\[ \log p_n \le 2\log n \]
and
\[ p_n \le c_1^{-1}\, n\log p_n \le 2c_1^{-1}\, n\log n. \]
Therefore, there exists a constant $c_4$ such that $p_n \le c_4\, n\log n$ for all $n \ge 2$. This completes the proof.
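Theorem 6.4 asserts that the ratio $p_n/(n\log n)$ stays between two positive constants. A quick look at actual primes, sketched below in Python as an added illustration with hypothetical names, shows how narrow the range is in practice. The numerical window $1 < p_n/(n\log n) < 2.2$ used in the comment is an empirical observation for the range tested, not a statement taken from the text.

```python
from math import log

def primes_up_to(n):
    """Sieve of Eratosthenes: list of primes p <= n (n >= 2)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]

def nth_prime_ratios(limit):
    """Ratios p_n / (n log n) for n >= 2 and all primes p_n <= limit."""
    primes = primes_up_to(limit)
    # enumerate(..., start=1) makes n the index of the prime p_n.
    return [p / (n * log(n)) for n, p in enumerate(primes, start=1) if n >= 2]

if __name__ == "__main__":
    ratios = nth_prime_ratios(10 ** 5)
    # Empirically the ratios lie between 1 and 2.2 in this range.
    print(min(ratios), max(ratios))
```

The largest ratio occurs at $n = 2$, where $p_2/(2\log 2) = 3/(2\log 2) \approx 2.16$.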


6.3. Mertens’s theorems 


In this section, we derive some important results about the distribution of prime 
numbers that were originally proved by Mertens. 


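Lemma 6.4 For any real number $x \ge 1$ we have
\[ 0 \le \sum_{1 \le n \le x} \log\left(\frac{x}{n}\right) < x. \]

Proof. Since the function $h(t) = \log(x/t)$ is decreasing on the interval $[1,x]$, it follows that
\[
\begin{aligned}
\sum_{1 \le n \le x} \log\left(\frac{x}{n}\right) &\le \log x + \int_1^x \log\left(\frac{x}{t}\right) dt \\
&= x\log x - \int_1^x \log t\,dt \\
&= x\log x - (x\log x - x + 1) \\
&< x.
\end{aligned}
\]
This completes the proof.

The function $\Lambda(n)$, called von Mangoldt’s function, is defined by
\[
\Lambda(n) =
\begin{cases}
\log p & \text{if $n = p^m$ is a prime power,} \\
0 & \text{otherwise.}
\end{cases}
\]
Then
\[ \psi(x) = \sum_{1 \le m \le x} \Lambda(m). \]

Theorem 6.5 (Mertens) For any real number $x \ge 1$, we have
\[ \sum_{n \le x} \frac{\Lambda(n)}{n} = \log x + O(1). \]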

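Proof. Let $N = [x]$. Then
\[ 0 \le \sum_{n \le x} \log\frac{x}{n} = N\log x - \sum_{n=1}^{N} \log n = x\log x - \log N! + O(\log x) < x \]
by Lemma 6.4, and so
\[ \log N! = x\log x + O(x). \]
It follows from Lemma 6.3 and Theorem 6.3 that
\[
\begin{aligned}
\log N! &= \sum_{p \le N} v_p(N!)\log p \\
&= \sum_{p \le N} \sum_{k=1}^{[\log N/\log p]} \left[\frac{N}{p^k}\right] \log p \\
&= \sum_{p^k \le N} \left[\frac{N}{p^k}\right] \log p \\
&= \sum_{p^k \le x} \left[\frac{x}{p^k}\right] \log p \\
&= \sum_{n \le x} \left[\frac{x}{n}\right] \Lambda(n) \\
&= \sum_{n \le x} \left(\frac{x}{n} + O(1)\right) \Lambda(n) \\
&= x \sum_{n \le x} \frac{\Lambda(n)}{n} + O\left(\sum_{n \le x} \Lambda(n)\right) \\
&= x \sum_{n \le x} \frac{\Lambda(n)}{n} + O(\psi(x)) \\
&= x \sum_{n \le x} \frac{\Lambda(n)}{n} + O(x).
\end{aligned}
\]
Therefore,
\[ x \sum_{n \le x} \frac{\Lambda(n)}{n} + O(x) = x\log x + O(x) \]
and
\[ \sum_{n \le x} \frac{\Lambda(n)}{n} = \log x + O(1). \]
This completes the proof.

Theorem 6.6 (Mertens) For any real number $x \ge 1$, we have
\[ \sum_{p \le x} \frac{\log p}{p} = \log x + O(1). \]

Proof. Since
\[
\begin{aligned}
0 \le \sum_{n \le x} \frac{\Lambda(n)}{n} - \sum_{p \le x} \frac{\log p}{p}
&= \sum_{\substack{p^k \le x \\ k \ge 2}} \frac{\log p}{p^k} \\
&\le \sum_{p \le x} \log p \sum_{k=2}^{\infty} \frac{1}{p^k} \\
&= \sum_{p \le x} \frac{\log p}{p(p-1)} \\
&= O(1),
\end{aligned}
\]
it follows from Theorem 6.5 that
\[ \sum_{p \le x} \frac{\log p}{p} = \sum_{n \le x} \frac{\Lambda(n)}{n} + O(1) = \log x + O(1). \]
This completes the proof.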
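Theorems 6.5 and 6.6 can be watched in action by computing both sums for a few values of $x$. The Python sketch below is an added illustration with hypothetical function names. It also returns the quantity $\sum_{p \le x} \log p/(p(p-1))$, which by the proof of Theorem 6.6 bounds the difference between the two sums.

```python
from math import log

def primes_up_to(n):
    """Sieve of Eratosthenes: list of primes p <= n (n >= 2)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]

def mertens_sums(x):
    """Return (sum_{n<=x} Lambda(n)/n, sum_{p<=x} log p / p,
    sum_{p<=x} log p / (p(p-1))) for an integer x >= 2."""
    lam_sum = prime_sum = bound = 0.0
    for p in primes_up_to(x):
        lp = log(p)
        prime_sum += lp / p
        bound += lp / (p * (p - 1))
        pk = p
        while pk <= x:          # Lambda(p^k) = log p for every k >= 1
            lam_sum += lp / pk
            pk *= p
    return lam_sum, prime_sum, bound

if __name__ == "__main__":
    for x in (10, 1000, 100000):
        lam_sum, prime_sum, _ = mertens_sums(x)
        print(x, lam_sum - log(x), prime_sum - log(x))
```

Both differences printed stay bounded as $x$ grows, which is the content of the $O(1)$ terms in the two theorems.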


Theorem 6.7 There exists a constant b, > 0 such that 


1 1 
Yb a togiogs +b +0 ( ) 
p log x 


psx 


for x > 2. 


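Proof. We can write

\[
\sum_{p \le x} \frac{1}{p} = \sum_{p \le x} \frac{\log p}{p} \cdot \frac{1}{\log p} = \sum_{n \le x} u(n) f(n),
\]

where

\[
u(n) = \begin{cases} \dfrac{\log p}{p} & \text{if } n = p \\ 0 & \text{otherwise} \end{cases}
\]

and

\[
f(t) = \frac{1}{\log t}.
\]

We define the functions $U(t)$ and $g(t)$ by

\[
U(t) = \sum_{n \le t} u(n) = \sum_{p \le t} \frac{\log p}{p} = \log t + g(t).
\]

Then $U(t) = 0$ for $t < 2$ and $g(t) = O(1)$ by Theorem 6.6. Therefore, the integral $\int_2^{\infty} g(t)/(t(\log t)^2)\,dt$ converges absolutely, and

\[
\int_x^{\infty} \frac{g(t)\,dt}{t(\log t)^2} = O\left(\frac{1}{\log x}\right).
\]

Since $f(t)$ is continuous and $U(t)$ is increasing, we can express the sum $\sum_{p \le x} 1/p$ as a Riemann-Stieltjes integral. Note that $U(t) = 0$ for $t < 2$. By partial summation, we obtain

\begin{align*}
\sum_{p \le x} \frac{1}{p} &= \sum_{n \le x} u(n) f(n) \\
&= \frac{1}{2} + \int_2^x f(t)\,dU(t) \\
&= f(x)U(x) - \int_2^x U(t)\,df(t) \\
&= \frac{\log x + g(x)}{\log x} - \int_2^x U(t) f'(t)\,dt \\
&= 1 + O\left(\frac{1}{\log x}\right) + \int_2^x \frac{\log t + g(t)}{t(\log t)^2}\,dt \\
&= \int_2^x \frac{dt}{t\log t} + \int_2^{\infty} \frac{g(t)}{t(\log t)^2}\,dt - \int_x^{\infty} \frac{g(t)}{t(\log t)^2}\,dt + 1 + O\left(\frac{1}{\log x}\right) \\
&= \log\log x - \log\log 2 + \int_2^{\infty} \frac{g(t)}{t(\log t)^2}\,dt + 1 + O\left(\frac{1}{\log x}\right) \\
&= \log\log x + b_1 + O\left(\frac{1}{\log x}\right),
\end{align*}

where

\[
b_1 = 1 - \log\log 2 + \int_2^{\infty} \frac{g(t)}{t(\log t)^2}\,dt.
\]

This completes the proof.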
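As a numerical illustration of Theorems 6.6 and 6.7 (not part of the proofs), the following Python sketch computes the two differences $\sum_{p \le x} (\log p)/p - \log x$ and $\sum_{p \le x} 1/p - \log\log x$. The function names `primes_up_to` and `mertens_sums` are hypothetical helpers introduced here. The first difference should stay bounded, and the second should be close to $b_1$, whose numerical value is approximately 0.2615.

```python
import math


def primes_up_to(n):
    """Sieve of Eratosthenes: return the list of primes p <= n (assumes n >= 2)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            # Cross out the multiples of i, starting at i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]


def mertens_sums(x):
    """Return (sum log p / p - log x, sum 1/p - log log x) over primes p <= x."""
    ps = primes_up_to(x)
    s_log = sum(math.log(p) / p for p in ps)
    s_rec = sum(1.0 / p for p in ps)
    return s_log - math.log(x), s_rec - math.log(math.log(x))


if __name__ == "__main__":
    for x in (10**3, 10**4, 10**5):
        print(x, mertens_sums(x))
```

The second component changes very little as $x$ grows, which is what the error term $O(1/\log x)$ in Theorem 6.7 predicts.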
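From the Taylor series for $\log(1 - x)$, we see that

\[
0 < \log\left(1 - \frac{1}{p}\right)^{-1} - \frac{1}{p} = \sum_{n=2}^{\infty} \frac{1}{np^n} < \sum_{n=2}^{\infty} \frac{1}{p^n} = \frac{1}{p(p-1)}. \tag{6.8}
\]

It follows from the comparison test that the series

\[
b_2 = \sum_{p} \left(\log\left(1 - \frac{1}{p}\right)^{-1} - \frac{1}{p}\right) = \sum_{p} \sum_{n=2}^{\infty} \frac{1}{np^n} \tag{6.9}
\]

converges.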
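Lemma 6.5 Let $b_1$ and $b_2$ be the positive numbers defined by (6.8) and (6.9). Then

\[
b_1 + b_2 = \gamma,
\]

where $\gamma$ is Euler’s constant.

Proof. Let $0 < \sigma < 1$. We define the function $F(\sigma)$ by

\begin{align*}
F(\sigma) &= \log\zeta(1+\sigma) - \sum_{p} \frac{1}{p^{1+\sigma}} \\
&= \sum_{p} \left(\log\left(1 - \frac{1}{p^{1+\sigma}}\right)^{-1} - \frac{1}{p^{1+\sigma}}\right) \\
&= \sum_{p} \sum_{n=2}^{\infty} \frac{1}{np^{n(1+\sigma)}}.
\end{align*}

By (6.1) and the Weierstrass M-test, the last series converges uniformly for $\sigma \ge 0$ and so represents a continuous function for $\sigma \ge 0$. Therefore,

\[
\lim_{\sigma \to 0^+} F(\sigma) = b_2. \tag{6.10}
\]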


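We shall find alternative representations for the functions $\log\zeta(1+\sigma)$ and $\sum_p p^{-1-\sigma}$. Since

\[
1 - \sigma + \frac{\sigma^2}{2e} < e^{-\sigma} < 1 - \sigma + \frac{\sigma^2}{2}
\]

for $0 < \sigma < 1$, it follows that

\[
1 - \frac{\sigma}{2} < \frac{1 - e^{-\sigma}}{\sigma} < 1 - \frac{\sigma}{2e}
\]

and

\[
1 + \frac{\sigma}{2e} < 1 + \frac{\sigma}{2e - \sigma} < \frac{\sigma}{1 - e^{-\sigma}} < 1 + \frac{\sigma}{2 - \sigma} < 1 + \sigma.
\]

Therefore,

\[
0 < \log\sigma + \log(1 - e^{-\sigma})^{-1} < \sigma,
\]

and so

\[
\log\frac{1}{\sigma} = \log(1 - e^{-\sigma})^{-1} + O(\sigma).
\]

By (6.3), we have

\begin{align*}
\log\zeta(1+\sigma) &= \log\frac{1}{\sigma} + O(\sigma) \\
&= \log(1 - e^{-\sigma})^{-1} + O(\sigma) \\
&= \sum_{n=1}^{\infty} \frac{e^{-\sigma n}}{n} + O(\sigma).
\end{align*}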


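By Theorem A.5,

\[
L(x) = \sum_{n \le x} \frac{1}{n} = \log x + \gamma + O\left(\frac{1}{x}\right)
\]

for $x \ge 1$. Let $f(x) = e^{-\sigma x}$. By partial summation, we have

\begin{align*}
\log\zeta(1+\sigma) &= \sum_{n=1}^{\infty} \frac{f(n)}{n} + O(\sigma) \\
&= \int_0^{\infty} f(x)\,dL(x) + O(\sigma) \\
&= -\int_0^{\infty} L(x)\,df(x) + O(\sigma) \\
&= \sigma \int_0^{\infty} e^{-\sigma x} L(x)\,dx + O(\sigma).
\end{align*}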


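By Theorem 6.7,

\[
S(x) = \sum_{p \le x} \frac{1}{p} = \log\log x + b_1 + O\left(\frac{1}{\log x}\right)
\]

for $x \ge 2$. Let $g(x) = x^{-\sigma}$. Again, by partial summation we have

\begin{align*}
\sum_{p} \frac{1}{p^{1+\sigma}} = \sum_{p} \frac{g(p)}{p} &= \int_1^{\infty} g(x)\,dS(x) = -\int_1^{\infty} S(x)\,dg(x) \\
&= \sigma \int_1^{\infty} \frac{S(x)\,dx}{x^{1+\sigma}} \\
&= \sigma \int_0^{\infty} e^{-\sigma x} S(e^x)\,dx.
\end{align*}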


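Since

\[
S(e^x) = \log x + b_1 + O\left(\frac{1}{x}\right)
\]

and

\[
L(x) = \log x + \gamma + O\left(\frac{1}{x}\right),
\]

it follows that

\[
L(x) - S(e^x) = \gamma - b_1 + O\left(\frac{1}{x}\right) = \gamma - b_1 + O\left(\frac{1}{x+1}\right)
\]

for $x \ge 1$. We also have

\[
L(x) - S(e^x) = \gamma - b_1 + O\left(\frac{1}{x+1}\right)
\]

for $0 \le x \le 1$. Therefore,

\begin{align*}
F(\sigma) &= \log\zeta(1+\sigma) - \sum_{p} \frac{1}{p^{1+\sigma}} \\
&= \sigma \int_0^{\infty} e^{-\sigma x} \left(L(x) - S(e^x)\right) dx + O(\sigma) \\
&= \sigma \int_0^{\infty} e^{-\sigma x} \left(\gamma - b_1 + O\left(\frac{1}{x+1}\right)\right) dx + O(\sigma) \\
&= (\gamma - b_1)\,\sigma \int_0^{\infty} e^{-\sigma x}\,dx + O\left(\sigma \int_0^{\infty} \frac{e^{-\sigma x}\,dx}{x+1}\right) + O(\sigma) \\
&= \gamma - b_1 + O\left(\sigma \int_0^{\infty} \frac{e^{-\sigma x}\,dx}{x+1}\right) + O(\sigma).
\end{align*}

Since

\begin{align*}
\int_0^{\infty} \frac{e^{-\sigma x}\,dx}{x+1} &\le \int_0^{1/\sigma} \frac{e^{-\sigma x}\,dx}{x+1} + \int_{1/\sigma}^{\infty} \frac{e^{-\sigma x}\,dx}{x} \\
&\le \int_0^{1/\sigma} \frac{dx}{x+1} + \int_1^{\infty} \frac{e^{-y}\,dy}{y} \\
&= \log\left(\frac{1}{\sigma} + 1\right) + O(1) \\
&\ll \log\left(\frac{1}{\sigma} + 1\right),
\end{align*}

it follows that

\[
F(\sigma) = \gamma - b_1 + O\left(\sigma \log\left(\frac{1}{\sigma} + 1\right)\right).
\]

By (6.10), we have

\[
b_2 = \lim_{\sigma \to 0^+} F(\sigma) = \gamma - b_1.
\]

This completes the proof.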
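Theorem 6.8 (Mertens’s formula) For $x \ge 2$,

\[
\prod_{p \le x} \left(1 - \frac{1}{p}\right)^{-1} = e^{\gamma} \log x + O(1),
\]

where $\gamma$ is Euler’s constant.

Proof. We begin with two observations. First,

\[
0 < \sum_{p > x} \sum_{k=2}^{\infty} \frac{1}{kp^k} < \sum_{p > x} \sum_{k=2}^{\infty} \frac{1}{p^k} = \sum_{p > x} \frac{1}{p(p-1)} \le \sum_{n > x} \frac{1}{n(n-1)} = O\left(\frac{1}{x}\right).
\]

Second, since $\exp(t) = 1 + O(t)$ for $t$ in any bounded interval and $O(1/\log x)$ is bounded for $x \ge 2$, it follows that

\[
\exp\left(O\left(\frac{1}{\log x}\right)\right) = 1 + O\left(\frac{1}{\log x}\right).
\]

Therefore,

\begin{align*}
\log \prod_{p \le x} \left(1 - \frac{1}{p}\right)^{-1} &= \sum_{p \le x} \log\left(1 - \frac{1}{p}\right)^{-1} \\
&= \sum_{p \le x} \sum_{k=1}^{\infty} \frac{1}{kp^k} \\
&= \sum_{p \le x} \frac{1}{p} + \sum_{p \le x} \sum_{k=2}^{\infty} \frac{1}{kp^k} \\
&= \log\log x + b_1 + O\left(\frac{1}{\log x}\right) + b_2 - \sum_{p > x} \sum_{k=2}^{\infty} \frac{1}{kp^k} \\
&= \log\log x + \gamma + O\left(\frac{1}{\log x}\right),
\end{align*}

since $b_1 + b_2 = \gamma$ by Lemma 6.5, and so

\begin{align*}
\prod_{p \le x} \left(1 - \frac{1}{p}\right)^{-1} &= e^{\gamma} \log x \,\exp\left(O\left(\frac{1}{\log x}\right)\right) \\
&= e^{\gamma} \log x \left(1 + O\left(\frac{1}{\log x}\right)\right) \\
&= e^{\gamma} \log x + O(1).
\end{align*}

This is Mertens’s formula.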
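Mertens’s formula can be checked numerically. The sketch below (an illustration only; `primes_up_to` and `mertens_ratio` are hypothetical helper names) computes the ratio of the product $\prod_{p \le x}(1 - 1/p)^{-1}$ to $\log x$, which by Theorem 6.8 should approach $e^{\gamma} \approx 1.781$.

```python
import math


def primes_up_to(n):
    """Sieve of Eratosthenes: return the list of primes p <= n (assumes n >= 2)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]


def mertens_ratio(x):
    """Return prod_{p <= x} (1 - 1/p)^(-1) divided by log x."""
    product = 1.0
    for p in primes_up_to(x):
        product *= 1.0 / (1.0 - 1.0 / p)
    return product / math.log(x)


if __name__ == "__main__":
    for x in (10**2, 10**3, 10**4, 10**5):
        print(x, mertens_ratio(x))
```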
The following result will be used in Chapter 10 in the proof of Chen’s theorem. 


Theorem 6.9 For any $\varepsilon > 0$, there exists a number $u_1 = u_1(\varepsilon)$ such that

\[
\prod_{u \le p < z} \left(1 - \frac{1}{p}\right)^{-1} < (1 + \varepsilon)\frac{\log z}{\log u}
\]

for any $u_1 \le u < z$.


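Proof. Let $\gamma$ be Euler’s constant, and choose $\delta$ with $0 < \delta < e^{\gamma}$ such that

\[
\frac{e^{\gamma} + \delta}{e^{\gamma} - \delta} < 1 + \varepsilon.
\]

By Theorem 6.8, we have

\[
\prod_{p < x} \left(1 - \frac{1}{p}\right)^{-1} \sim e^{\gamma} \log x,
\]

and so there exists a number $u_1$ such that

\[
(e^{\gamma} - \delta)\log x < \prod_{p < x} \left(1 - \frac{1}{p}\right)^{-1} < (e^{\gamma} + \delta)\log x
\]

for all $x \ge u_1$. Therefore, if $u_1 \le u < z$, we have

\begin{align*}
\prod_{u \le p < z} \left(1 - \frac{1}{p}\right)^{-1} &= \frac{\prod_{p < z} \left(1 - \frac{1}{p}\right)^{-1}}{\prod_{p < u} \left(1 - \frac{1}{p}\right)^{-1}} \\
&< \frac{(e^{\gamma} + \delta)\log z}{(e^{\gamma} - \delta)\log u} \\
&< (1 + \varepsilon)\frac{\log z}{\log u}.
\end{align*}

This completes the proof.

6.4 Brun’s method and twin primes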


There is a structural similarity between the twin prime conjecture and the Goldbach
conjecture. The twin prime conjecture states that there exist infinitely many prime
numbers $p$ such that $p + 2$ is also a prime number or, equivalently, there exist
infinitely many integers $k$ such that $k(k + 2)$ has exactly two prime factors. The
Goldbach conjecture states that every even integer $n \ge 4$ can be written as the sum
of two primes or, equivalently, there exists an integer $k$ such that $1 < k < n - 1$
and $k(n - k)$ has exactly two prime factors. We begin the study of sieve methods
with a simple proof of the theorem that the twin primes are sparse in the sense that
the sum of the reciprocals of the twin primes converges. This contrasts with the
result (Theorem 6.7) that the sum of the reciprocals of all of the primes diverges
like $\log\log x$.


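Lemma 6.6 If $\ell \ge 1$ and $0 \le m \le \ell$, then

\[
\sum_{k=0}^{m} (-1)^k \binom{\ell}{k} = (-1)^m \binom{\ell - 1}{m}.
\]

Proof. This is by induction on $m$. It is easy to check that the equation is true for
$m = 0, 1, 2$. If $1 \le m \le \ell$ and the equation holds for $m - 1$, then

\begin{align*}
\sum_{k=0}^{m} (-1)^k \binom{\ell}{k} &= \sum_{k=0}^{m-1} (-1)^k \binom{\ell}{k} + (-1)^m \binom{\ell}{m} \\
&= (-1)^{m-1} \binom{\ell - 1}{m - 1} + (-1)^m \binom{\ell}{m} \\
&= (-1)^m \left(\binom{\ell}{m} - \binom{\ell - 1}{m - 1}\right) \\
&= (-1)^m \binom{\ell - 1}{m}.
\end{align*}

This completes the proof.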
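The identity of Lemma 6.6 is easy to test by machine for small parameters. The following sketch (with hypothetical helper names `alternating_partial_sum` and `lemma_6_6_holds`) uses exact integer arithmetic from the standard library.

```python
from math import comb


def alternating_partial_sum(l, m):
    """Left side of Lemma 6.6: sum_{k=0}^{m} (-1)^k * C(l, k)."""
    return sum((-1) ** k * comb(l, k) for k in range(m + 1))


def lemma_6_6_holds(l, m):
    """Compare the partial sum with the closed form (-1)^m * C(l-1, m)."""
    return alternating_partial_sum(l, m) == (-1) ** m * comb(l - 1, m)


if __name__ == "__main__":
    print(all(lemma_6_6_holds(l, m) for l in range(1, 13) for m in range(l + 1)))
```

Note that for $m = \ell$ both sides vanish, since `comb(l - 1, l)` is 0 and the full alternating sum of binomial coefficients is 0.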
The following combinatorial inequality, a version of the principle of inclusion— 
exclusion, is the simplest form of the Brun sieve. 


Theorem 6.10 (The Brun sieve) Let $X$ be a nonempty, finite set of $N$ objects,
and let $P_1, \ldots, P_r$ be $r$ different properties that elements of the set $X$ might have.
Let $N_0$ denote the number of elements of $X$ that have none of these properties.
For any subset $I = \{i_1, \ldots, i_k\}$ of $\{1, 2, \ldots, r\}$, let $N(I) = N(i_1, \ldots, i_k)$ denote
the number of elements of $X$ that have each of the properties $P_{i_1}, P_{i_2}, \ldots, P_{i_k}$. Let
$N(\emptyset) = |X| = N$. If $m$ is a nonnegative even integer, then

\[
N_0 \le \sum_{k=0}^{m} (-1)^k \sum_{|I| = k} N(I). \tag{6.11}
\]

If $m$ is a nonnegative odd integer, then

\[
N_0 \ge \sum_{k=0}^{m} (-1)^k \sum_{|I| = k} N(I). \tag{6.12}
\]


Proof. Inequalities (6.11) and (6.12) count the elements of X according to the 
various properties that each element possesses. We shall calculate how much each 
element of X contributes to the left and right sides of these inequalities. 

Let $x$ be an element of the set $X$, and suppose that $x$ has exactly $\ell$ properties
$P_i$. If $\ell = 0$, then $x$ is counted once in $N_0$ and once in $N(\emptyset)$, but is not counted
in $N(I)$ if $I$ is nonempty. If $\ell \ge 1$, then $x$ is not counted in $N_0$. By renumbering
the properties, we can assume that $x$ has the properties $P_1, P_2, \ldots, P_\ell$. Let $I \subseteq
\{1, 2, \ldots, \ell, \ldots, r\}$. If $i \in I$ for some $i > \ell$, then $x$ is not counted in $N(I)$. If
$I \subseteq \{1, 2, \ldots, \ell\}$, then $x$ contributes 1 to $N(I)$. For each $k = 0, 1, \ldots, \ell$, there are
exactly $\binom{\ell}{k}$ such subsets with $|I| = k$. If $m \ge \ell$, then the element $x$ contributes

\[
\sum_{k=0}^{\ell} (-1)^k \binom{\ell}{k} = 0
\]

to the right sides of the inequalities. If $m < \ell$, then $x$ contributes

\[
\sum_{k=0}^{m} (-1)^k \binom{\ell}{k}
\]

to the right sides of inequalities (6.11) and (6.12). By Lemma 6.6, this contribution
equals $(-1)^m \binom{\ell - 1}{m}$, which is positive if $m$ is even and negative if $m$ is odd. This completes the proof.
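To see inequalities (6.11) and (6.12) in a concrete case, take $X = \{1, \ldots, x\}$ and let $P_i$ be the property "divisible by the $i$th prime in a given list." Then $N(I)$ is the number of multiples of the product of the chosen primes. The sketch below is an illustration under these assumptions; the function names are hypothetical.

```python
from itertools import combinations
from math import prod


def brun_truncated_sum(x, primes, m):
    """Right side of (6.11)/(6.12) for X = {1..x} and P_i = 'divisible by primes[i]'.

    For distinct primes, N(I) = floor(x / product of the primes in I),
    and the empty product is 1, so N(empty set) = x.
    """
    total = 0
    for k in range(m + 1):
        for subset in combinations(primes, k):
            total += (-1) ** k * (x // prod(subset))
    return total


def unsieved_count(x, primes):
    """N_0: the number of n in {1..x} divisible by none of the given primes."""
    return sum(1 for n in range(1, x + 1) if all(n % p for p in primes))


if __name__ == "__main__":
    x, primes = 1000, [2, 3, 5, 7, 11]
    n0 = unsieved_count(x, primes)
    for m in range(len(primes) + 1):
        print(m, brun_truncated_sum(x, primes, m), n0)
```

Even values of $m$ give upper bounds for $N_0$, odd values give lower bounds, and $m = r$ recovers $N_0$ exactly (the full inclusion-exclusion formula).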


Lemma 6.7 For $x \ge 1$ and for any congruence class $a \pmod{m}$, the number
of positive integers not exceeding $x$ that are congruent to $a$ modulo $m$ is $x/m + \theta$,
where $|\theta| < 1$.

Proof. If $x/m = q \in \mathbf{Z}$, then the set $\{1, \ldots, qm\}$ contains exactly $x/m$ elements
in every congruence class modulo $m$.

Suppose that $x/m \notin \mathbf{Z}$. Let $[x]$ and $\{x\}$ denote the integer and fractional parts
of $x$, respectively, and let $[x] = qm + r$, where $0 \le r < m$. Then

\[
qm < x = qm + r + \{x\} \le qm + (m - 1) + \{x\} < (q + 1)m,
\]

and so

\[
q < \frac{x}{m} < q + 1. \tag{6.13}
\]

The positive integers up to $x$ can be partitioned into $q + 1$ pairwise disjoint sets such
that $q$ of these sets are complete systems of residues modulo $m$, and the remaining
set is a subset of a complete system of residues modulo $m$. It follows that there are
either $q$ or $q + 1$ integers in the congruence class $a \pmod{m}$. The lemma follows
from inequality (6.13).

Lemma 6.8 Let $x \ge 1$, and let $p_{i_1}, \ldots, p_{i_k}$ be distinct odd primes. Let $N(i_1, \ldots, i_k)$
denote the number of positive integers $n \le x$ such that

\[
n(n + 2) \equiv 0 \pmod{p_{i_1} \cdots p_{i_k}}. \tag{6.14}
\]

Then

\[
N(i_1, \ldots, i_k) = \frac{2^k x}{p_{i_1} \cdots p_{i_k}} + 2^k \theta,
\]

where $|\theta| < 1$.


Proof. If $p$ is an odd prime and $n(n + 2) \equiv 0 \pmod{p}$, then either

\[
n \equiv 0 \pmod{p}
\]

or

\[
n \equiv -2 \pmod{p}.
\]

Moreover, $0 \not\equiv -2 \pmod{p}$ since $p \ge 3$. If the integer $n$ satisfies the congruence (6.14), then there exist unique integers $u_1, \ldots, u_k \in \{0, -2\}$ such that

\begin{align*}
n &\equiv u_1 \pmod{p_{i_1}} \\
n &\equiv u_2 \pmod{p_{i_2}} \\
&\;\;\vdots \tag{6.15} \\
n &\equiv u_k \pmod{p_{i_k}}.
\end{align*}

By the Chinese remainder theorem, for each of the $2^k$ choices of $u_1, \ldots, u_k$ there
exists a unique congruence class $a \pmod{p_{i_1} \cdots p_{i_k}}$ such that $n$ is a solution of
the system of congruences (6.15) if and only if

\[
n \equiv a \pmod{p_{i_1} p_{i_2} \cdots p_{i_k}}.
\]

By Lemma 6.7, this congruence has

\[
\frac{x}{p_{i_1} p_{i_2} \cdots p_{i_k}} + \theta(a)
\]

solutions in positive integers not exceeding $x$, where $|\theta(a)| < 1$. Therefore,

\[
N(i_1, \ldots, i_k) = \frac{2^k x}{p_{i_1} \cdots p_{i_k}} + 2^k \theta,
\]

where $|\theta| < 1$. This completes the proof.
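Lemma 6.8 can be tested by direct counting. In the sketch below (an illustration; the function names are hypothetical), `lemma_6_8_error` returns the difference between the exact count $N(i_1, \ldots, i_k)$ and the main term $2^k x/(p_{i_1} \cdots p_{i_k})$, which the lemma bounds by $2^k$ in absolute value.

```python
from math import prod


def count_solutions(x, odd_primes):
    """Number of integers n with 1 <= n <= x and n(n+2) divisible by prod(odd_primes)."""
    q = prod(odd_primes)
    return sum(1 for n in range(1, x + 1) if (n * (n + 2)) % q == 0)


def lemma_6_8_error(x, odd_primes):
    """Exact count minus the main term 2^k * x / (p_1 ... p_k)."""
    k = len(odd_primes)
    main_term = 2 ** k * x / prod(odd_primes)
    return count_solutions(x, odd_primes) - main_term


if __name__ == "__main__":
    for primes in ([3], [3, 5], [3, 5, 7], [5, 11, 13]):
        print(primes, lemma_6_8_error(1234, primes))
```

The input list is assumed to consist of distinct odd primes, as in the statement of the lemma.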


Theorem 6.11 (Brun) Let $\pi_2(x)$ denote the number of primes $p$ not exceeding $x$
such that $p + 2$ is also prime. Then

\[
\pi_2(x) \ll \frac{x(\log\log x)^2}{(\log x)^2}.
\]

Proof. Let $5 \le y < x$. Let $r = \pi(y) - 1$ denote the number of odd primes
not exceeding $y$. We denote these primes by $p_1, \ldots, p_r$. Let $\pi_2(y, x)$ denote the
number of primes $p$ such that $y < p \le x$ and $p + 2$ is also prime. If $y < n \le x$
and both $n$ and $n + 2$ are prime numbers, then $n > p_i$ for $i = 1, \ldots, r$, and

\[
n(n + 2) \not\equiv 0 \pmod{p_i}
\]

for all $i$. Let $N_0(y, x)$ denote the number of positive integers $n \le x$ such that

\[
n(n + 2) \not\equiv 0 \pmod{p_i}
\]

for all $i = 1, \ldots, r$. Then

\[
\pi_2(x) \le y + \pi_2(y, x) \le y + N_0(y, x).
\]

We shall use the Brun sieve to find an upper bound for $N_0(y, x)$.

Let $X$ be the set of positive integers not exceeding $x$. For each odd prime
$p_i \le y$, we let $P_i$ be the property that $n(n + 2)$ is divisible by $p_i$. For any subset $I =
\{i_1, \ldots, i_k\}$ contained in $\{1, \ldots, r\}$, we let $N(I)$ be the number of integers $n \in X$
such that $n(n + 2)$ is divisible by each of the primes $p_{i_1}, \ldots, p_{i_k}$ or, equivalently,
such that $n(n + 2)$ is divisible by $p_{i_1} \cdots p_{i_k}$. By Lemma 6.8, we have

\[
N(I) = N(i_1, \ldots, i_k) = \frac{2^k x}{p_{i_1} \cdots p_{i_k}} + 2^k \theta.
\]
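Let $m$ be an even integer such that $1 \le m \le r$. By inequality (6.11), we have

\begin{align*}
N_0(y, x) &\le \sum_{k=0}^{m} (-1)^k \sum_{|I| = k} N(I) \\
&= \sum_{k=0}^{m} (-1)^k \sum_{\{i_1, \ldots, i_k\} \subseteq \{1, \ldots, r\}} \left(\frac{2^k x}{p_{i_1} \cdots p_{i_k}} + 2^k \theta\right) \\
&\le x \sum_{k=0}^{m} \sum_{\{i_1, \ldots, i_k\} \subseteq \{1, \ldots, r\}} \frac{(-2)^k}{p_{i_1} \cdots p_{i_k}} + \sum_{k=0}^{m} \binom{r}{k} 2^k \\
&= x \sum_{k=0}^{r} \sum_{\{i_1, \ldots, i_k\} \subseteq \{1, \ldots, r\}} \frac{(-2)^k}{p_{i_1} \cdots p_{i_k}} - x \sum_{k=m+1}^{r} \sum_{\{i_1, \ldots, i_k\} \subseteq \{1, \ldots, r\}} \frac{(-2)^k}{p_{i_1} \cdots p_{i_k}} + O\left(\sum_{k=0}^{m} \binom{r}{k} 2^k\right).
\end{align*}

We shall estimate these three terms separately. By Theorem 6.8,

\begin{align*}
x \sum_{k=0}^{r} \sum_{\{i_1, \ldots, i_k\} \subseteq \{1, \ldots, r\}} \frac{(-2)^k}{p_{i_1} \cdots p_{i_k}} &= x \prod_{2 < p \le y} \left(1 - \frac{2}{p}\right) \\
&< x \prod_{2 < p \le y} \left(1 - \frac{1}{p}\right)^2 \\
&\ll \frac{x}{(\log y)^2}.
\end{align*}

Let $s_k(x_1, \ldots, x_r)$ be the elementary symmetric polynomial of degree $k$ in $r$
variables. For any nonnegative real numbers $x_1, \ldots, x_r$ we have

\begin{align*}
s_k(x_1, \ldots, x_r) &= \sum_{\{i_1, \ldots, i_k\} \subseteq \{1, \ldots, r\}} x_{i_1} \cdots x_{i_k} \\
&\le \frac{(x_1 + \cdots + x_r)^k}{k!} \\
&= \frac{s_1(x_1, \ldots, x_r)^k}{k!} \\
&< \left(\frac{e}{k}\right)^k s_1(x_1, \ldots, x_r)^k,
\end{align*}

since $(k/e)^k < k!$. Therefore,

\begin{align*}
\left| x \sum_{k=m+1}^{r} \sum_{\{i_1, \ldots, i_k\} \subseteq \{1, \ldots, r\}} \frac{(-2)^k}{p_{i_1} \cdots p_{i_k}} \right|
&\le x \sum_{k=m+1}^{r} \sum_{\{i_1, \ldots, i_k\} \subseteq \{1, \ldots, r\}} \frac{2^k}{p_{i_1} \cdots p_{i_k}} \\
&= x \sum_{k=m+1}^{r} \sum_{\{i_1, \ldots, i_k\} \subseteq \{1, \ldots, r\}} \left(\frac{2}{p_{i_1}}\right) \cdots \left(\frac{2}{p_{i_k}}\right) \\
&= x \sum_{k=m+1}^{r} s_k\left(\frac{2}{p_1}, \ldots, \frac{2}{p_r}\right) \\
&< x \sum_{k=m+1}^{r} \left(\frac{e}{k}\right)^k s_1\left(\frac{2}{p_1}, \ldots, \frac{2}{p_r}\right)^k \\
&= x \sum_{k=m+1}^{r} \left(\frac{e}{k}\right)^k \left(\frac{2}{p_1} + \cdots + \frac{2}{p_r}\right)^k \\
&\le x \sum_{k=m+1}^{r} \left(\frac{2e}{m}\right)^k \left(\sum_{p \le y} \frac{1}{p}\right)^k \\
&< x \sum_{k=m+1}^{r} \left(\frac{c \log\log y}{m}\right)^k,
\end{align*}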


where $c$ is an absolute positive constant. If we choose the even integer $m$ so that

\[
m > 2c \log\log y,
\]

then

\[
x \sum_{k=m+1}^{r} \left(\frac{c \log\log y}{m}\right)^k < x \sum_{k=m+1}^{\infty} \frac{1}{2^k} = \frac{x}{2^m}.
\]

Since $r$ is the number of odd primes less than or equal to $y$, it follows that $2r < y$,
and we get the following estimate for the third term:

\[
\sum_{k=0}^{m} \binom{r}{k} 2^k \le \sum_{k=0}^{m} (2r)^k \ll (2r)^m < y^m.
\]

Combining these three estimates, we obtain

\[
\pi_2(x) \ll y + \frac{x}{(\log y)^2} + \frac{x}{2^m} + y^m \ll \frac{x}{(\log y)^2} + \frac{x}{2^m} + y^m, \tag{6.16}
\]

where the implied constant is absolute, $y$ is any real number satisfying

\[
5 \le y < x, \tag{6.17}
\]

and $m$ is any even integer such that

\[
m > 2c \log\log y. \tag{6.18}
\]

Let $c' = \max\{2c, (\log 2)^{-1}\}$, and let

\[
y = \exp\left(\frac{\log x}{3c' \log\log x}\right) = x^{1/(3c' \log\log x)}
\]

and

\[
m = 2[c' \log\log x].
\]


The number $y$ satisfies conditions (6.17) and (6.18) for $x$ sufficiently large. We
estimate the three terms in (6.16) with these values of $y$ and $m$. Since

\[
\log y = \frac{\log x}{3c' \log\log x},
\]

we obtain the main term

\[
\frac{x}{(\log y)^2} \ll \frac{x(\log\log x)^2}{(\log x)^2}.
\]

Next, since $c' \ge (\log 2)^{-1}$ and

\[
m = 2[c' \log\log x] > 2c' \log\log x - 2,
\]

we obtain

\[
\frac{x}{2^m} < \frac{4x}{2^{2c' \log\log x}} = \frac{4x}{(\log x)^{2c' \log 2}} \le \frac{4x}{(\log x)^2}.
\]

Finally,

\[
y^m \le y^{2c' \log\log x} = \exp\left(\frac{2c' \log\log x \,\log x}{3c' \log\log x}\right) = x^{2/3}.
\]

Combining these three estimates, we obtain

\[
\pi_2(x) \ll \frac{x(\log\log x)^2}{(\log x)^2}.
\]

This completes the proof.

Theorem 6.12 (Brun) Let $p_1, p_2, \ldots$ be the sequence of prime numbers $p$ such
that $p + 2$ is also prime. Then

\begin{align*}
\sum_{n=1}^{\infty} \left(\frac{1}{p_n} + \frac{1}{p_n + 2}\right)
&= \left(\frac{1}{3} + \frac{1}{5}\right) + \left(\frac{1}{5} + \frac{1}{7}\right) + \left(\frac{1}{11} + \frac{1}{13}\right) + \left(\frac{1}{17} + \frac{1}{19}\right) + \cdots \\
&< \infty.
\end{align*}


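Proof. Theorem 6.11 implies that

\[
\pi_2(x) \ll \frac{x}{(\log x)^{3/2}}
\]

for all $x \ge 2$. Therefore,

\[
n = \pi_2(p_n) \ll \frac{p_n}{(\log p_n)^{3/2}} \le \frac{p_n}{(\log n)^{3/2}}
\]

for $n \ge 2$, and so the series

\[
\sum_{n=1}^{\infty} \frac{1}{p_n} = \frac{1}{3} + \sum_{n=2}^{\infty} \frac{1}{p_n} \ll \frac{1}{3} + \sum_{n=2}^{\infty} \frac{1}{n(\log n)^{3/2}}
\]

converges. This completes the proof.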
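The convergence in Theorem 6.12 is very slow, as a direct computation of partial sums suggests. The sketch below (an illustration; `primes_up_to` and `brun_partial_sum` are hypothetical helper names) sums $1/p + 1/(p+2)$ over all twin prime pairs with $p \le x$, following the convention of the theorem in which the pairs $(3, 5)$ and $(5, 7)$ both appear.

```python
import math


def primes_up_to(n):
    """Sieve of Eratosthenes: return the list of primes p <= n (assumes n >= 2)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]


def brun_partial_sum(x):
    """Sum of 1/p + 1/(p+2) over primes p <= x such that p + 2 is also prime."""
    ps = primes_up_to(x + 2)
    prime_set = set(ps)
    return sum(1.0 / p + 1.0 / (p + 2) for p in ps if p <= x and p + 2 in prime_set)


if __name__ == "__main__":
    for x in (10**3, 10**4, 10**5, 10**6):
        print(x, brun_partial_sum(x))
```

The partial sums increase with $x$ but remain well below 2, consistent with the convergence of the series.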


6.5 Notes 


Dickson [22, vol. I, pp. 421-424] contains a brief account of early results concerning
the Goldbach conjecture. Sinisalo [117] has verified the Goldbach conjecture
by computer for all even integers up to $4 \cdot 10^{11}$. Wang’s book Goldbach
Conjecture [137] is an anthology of classic papers on this subject.

Brun [7] obtained the first significant result concerning the Goldbach conjecture 
in 1920. By means of the combinatorial method known today as the Brun sieve, he 
proved that every sufficiently large even integer can be written as the sum of two 
integers, each of which is the product of at most nine primes. Brun also obtained 
the first nontrivial results concerning the twin prime conjecture. In addition to 
Theorem 6.11 and Theorem 6.12, he also proved that there are infinitely many 
integers n such that both n and n + 2 are the products of at most 9 primes. The 
application of the Brun sieve to the twin prime conjecture follows Landau [78]. 

By Theorem 6.12, the sum over the reciprocals of the twin primes converges.
The sum of this infinite series is called Brun’s constant; its value is estimated to be
$1.9021604 \pm 5 \times 10^{-7}$ (see Shanks-Wrench [112] and Brent [5]). It is a difficult
computational problem to determine Brun’s constant to high precision. In the
process of trying to improve the estimates for Brun’s constant, Nicely discovered
a defect in Intel’s Pentium computer chip (see [15]).

A popular game among computational number theorists is to find explicit examples
of twin primes. On October 18, 1995, Harvey Dubner announced over the
Internet that $p$ and $p + 2$ are prime numbers for

\[
p = 570{,}918{,}348 \cdot 10^{5120} - 1 = 2^2 \cdot 3^3 \cdot 7 \cdot 11 \cdot 13 \cdot 5281 \cdot 10^{5120} - 1.
\]

The prime $p$ has 5129 digits. This established a new record for the largest twin
prime.

For other elementary results about the distribution of prime numbers, see Ellison 
and Ellison [29], Hardy and Wright [51], Ingham [66], and Tenenbaum [121]. 
Rosen [104] has generalized Mertens’s Theorem 6.8 to algebraic number fields. 


6.6 Exercises 


1. Let $n$ be a positive integer. Prove that

\[
\log n = \sum_{d \mid n} \Lambda(d)
\]

and

\[
\Lambda(n) = -\sum_{d \mid n} \mu(d) \log d.
\]

2. Let $\omega(n)$ denote the number of distinct prime divisors of $n$. Let $n \ge 2$ and
$r \ge 0$. Prove that

\[
\sum_{\substack{d \mid n \\ \omega(d) \le 2r+1}} \mu(d) \le 0 \le \sum_{\substack{d \mid n \\ \omega(d) \le 2r}} \mu(d).
\]

3. With the notation of Theorem 6.10, prove that

\[
N_0 = \sum_{k=0}^{r} (-1)^k \sum_{|I| = k} N(I).
\]

This formula is often called the inclusion–exclusion principle.

4. Use the inclusion–exclusion principle to prove that

\[
\varphi(n) = n \prod_{p \mid n} \left(1 - \frac{1}{p}\right) = n \sum_{d \mid n} \frac{\mu(d)}{d},
\]

where $\varphi(n)$ is the Euler $\varphi$-function.

5. Let $\Phi(x, y)$ denote the number of positive integers $n \le x$ that are not
divisible by any prime $p \le y$. Prove that

\[
\Phi(x, y) = x \prod_{p \le y} \left(1 - \frac{1}{p}\right) + O\left(2^{\pi(y)}\right).
\]

6. Prove that

\[
\prod_{r < p \le x} \left(1 - \frac{r}{p}\right) \ll \frac{1}{(\log x)^r}.
\]

7. Prove that

\[
\sum_{n \le x} \left(\log\frac{x}{n}\right)^k = k!\,x + O\left((\log x)^k\right).
\]

8. Prove that

The Shnirel’man–Goldbach theorem


Das allgemeine Problem der additiven Zahlentheorie ist die Darstellbarkeit
aller natürlichen Zahlen durch eine beschränkte Anzahl von
Summanden einer gegebenen Folge von natürlichen Zahlen, z. B. der
Primzahlfolge oder der Folge der p-ten Potenzen.¹


L. G. Shnirel’man [114] 


7.1. The Goldbach conjecture 


In a letter to Euler in 1742, Goldbach conjectured that every positive even integer
$n > 2$ is the sum of two primes. Euler replied that he believed the conjecture
but could not prove it. It is still unproven, but it has been confirmed by computer
calculations for even integers up to $4 \cdot 10^{11}$.

In 1930, Shnirel’man proved that every integer greater than one is the sum of
a bounded number of primes. This is a great theorem, the first significant result
on the Goldbach conjecture. Shnirel’man used purely combinatorial methods: the
Brun sieve and a theorem about the density of the sum of two sets of integers.
We shall prove Shnirel’man’s theorem in this chapter. Instead of the Brun sieve,
however, we shall use a sieve method due to Selberg, which is also completely
elementary but more elegant and in many cases more powerful than Brun’s original
sieve argument.

¹The general problem in additive number theory is the representation of the natural
numbers as the sum of a bounded number of terms from a given sequence of natural numbers,
e.g. the sequence of prime numbers or the sequence of p-th powers.


7.2. The Selberg sieve 


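Lemma 7.1 (Cauchy–Schwarz inequality) Let $a_1, \ldots, a_n, b_1, \ldots, b_n$ be real
numbers. Then

\[
\left(\sum_{i=1}^{n} a_i b_i\right)^2 \le \left(\sum_{i=1}^{n} a_i^2\right)\left(\sum_{i=1}^{n} b_i^2\right).
\]

If $a_j \ne 0$ for some $j$, then

\[
\left(\sum_{i=1}^{n} a_i b_i\right)^2 = \left(\sum_{i=1}^{n} a_i^2\right)\left(\sum_{i=1}^{n} b_i^2\right)
\]

if and only if there is a real number $t$ such that $b_i = t a_i$ for all $i = 1, \ldots, n$.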


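Proof. Since

\begin{align*}
0 &\le \sum_{1 \le i < j \le n} (a_i b_j - a_j b_i)^2 \\
&= \sum_{1 \le i < j \le n} \left(a_i^2 b_j^2 - 2 a_i a_j b_i b_j + a_j^2 b_i^2\right) \\
&= \sum_{i=1}^{n} a_i^2 \sum_{j=1}^{n} b_j^2 - \left(\sum_{i=1}^{n} a_i b_i\right)^2,
\end{align*}

we have

\[
\left(\sum_{i=1}^{n} a_i b_i\right)^2 \le \left(\sum_{i=1}^{n} a_i^2\right)\left(\sum_{i=1}^{n} b_i^2\right).
\]

Moreover,

\[
\left(\sum_{i=1}^{n} a_i b_i\right)^2 = \left(\sum_{i=1}^{n} a_i^2\right)\left(\sum_{i=1}^{n} b_i^2\right)
\]

if and only if

\[
a_i b_j = a_j b_i
\]

for all $i \ne j$. In this case, if $a_j \ne 0$ for some $j$, let $t = b_j/a_j$. Then

\[
b_i = \left(\frac{b_j}{a_j}\right) a_i = t a_i
\]

for $i = 1, \ldots, n$. This completes the proof.

Lemma 7.2 Let $a_1, \ldots, a_n$ be positive real numbers and $b_1, \ldots, b_n$ be any real
numbers. The minimum value of the quadratic form

\[
Q(y_1, \ldots, y_n) = a_1 y_1^2 + \cdots + a_n y_n^2
\]

subject to the linear constraint

\[
b_1 y_1 + \cdots + b_n y_n = 1 \tag{7.1}
\]

is

\[
m = \left(\sum_{i=1}^{n} \frac{b_i^2}{a_i}\right)^{-1},
\]

and this value is attained if and only if

\[
y_i = \frac{m b_i}{a_i}
\]

for all $i = 1, \ldots, n$.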


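Proof. Let $y_1, \ldots, y_n$ be real numbers that satisfy (7.1). By the Cauchy–Schwarz
inequality, we have

\[
1 = \left(\sum_{i=1}^{n} b_i y_i\right)^2 = \left(\sum_{i=1}^{n} \frac{b_i}{\sqrt{a_i}} \sqrt{a_i}\, y_i\right)^2 \le \left(\sum_{i=1}^{n} \frac{b_i^2}{a_i}\right)\left(\sum_{i=1}^{n} a_i y_i^2\right),
\]

and so

\[
Q(y_1, \ldots, y_n) = \sum_{i=1}^{n} a_i y_i^2 \ge \left(\sum_{i=1}^{n} \frac{b_i^2}{a_i}\right)^{-1} = m.
\]

Moreover,

\[
Q(y_1, \ldots, y_n) = \sum_{i=1}^{n} a_i y_i^2 = m
\]

if and only if there exists a real number $t$ such that, for all $i = 1, \ldots, n$,

\[
\sqrt{a_i}\, y_i = \frac{t b_i}{\sqrt{a_i}}
\]

or, equivalently,

\[
y_i = \frac{t b_i}{a_i}.
\]

This implies that

\[
1 = \sum_{i=1}^{n} b_i y_i = t \sum_{i=1}^{n} \frac{b_i^2}{a_i} = \frac{t}{m},
\]

and so

\[
t = m
\]

and

\[
y_i = \frac{m b_i}{a_i}.
\]

Conversely, if $y_i = m b_i/a_i$ for all $i$, then $\sum_{i=1}^{n} b_i y_i = 1$ and $Q(y_1, \ldots, y_n) = m$.
This completes the proof.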
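Lemma 7.2 can be checked on a small example with exact rational arithmetic. In the sketch below (an illustration; the function names are hypothetical), the inputs are assumed to be integers with all $a_i > 0$ and not all $b_i = 0$.

```python
from fractions import Fraction


def constrained_minimum(a, b):
    """Return (m, y) with m = (sum b_i^2 / a_i)^(-1) and y_i = m * b_i / a_i."""
    m = 1 / sum(Fraction(bi * bi, ai) for ai, bi in zip(a, b))
    y = [m * Fraction(bi, ai) for ai, bi in zip(a, b)]
    return m, y


def quadratic_form(a, y):
    """Q(y) = a_1 y_1^2 + ... + a_n y_n^2."""
    return sum(ai * yi * yi for ai, yi in zip(a, y))


if __name__ == "__main__":
    a, b = [1, 2, 3], [1, -1, 2]
    m, y = constrained_minimum(a, b)
    print(m, y, quadratic_form(a, y))
    # Moving along a direction v with b . v = 0 keeps the constraint (7.1)
    # but can only increase Q.
    v = [1, 1, 0]
    y_shifted = [yi + vi for yi, vi in zip(y, v)]
    print(quadratic_form(a, y_shifted) > m)
```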


Theorem 7.1 (Selberg sieve) Let $A$ be a finite sequence of integers, and let $|A|$
denote the number of terms of the sequence. Let $\mathcal{P}$ be a set of primes. For any real
number $z \ge 2$, let

\[
P(z) = \prod_{\substack{p < z \\ p \in \mathcal{P}}} p.
\]

The “sieving function”

\[
S(A, \mathcal{P}, z)
\]

denotes the number of terms of the sequence $A$ that are not divisible by any prime
$p \in \mathcal{P}$ such that $p < z$. For every square-free positive integer $d$, let $|A_d|$ denote
the number of terms of the sequence $A$ that are divisible by $d$. Let $g(k)$ be a
multiplicative function such that

\[
0 < g(p) < 1 \quad \text{for all } p \in \mathcal{P},
\]

and let $g_1(m)$ be a completely multiplicative function such that $g_1(p) = g(p)$ for
all $p \in \mathcal{P}$. Define the “remainder term” $r(d)$ and the function $G(z)$ by

\[
r(d) = |A_d| - g(d)|A|
\]

and

\[
G(z) = \sum_{\substack{m < z \\ p \mid m \Rightarrow p \in \mathcal{P}}} g_1(m).
\]

Then

\[
S(A, \mathcal{P}, z) \le \frac{|A|}{G(z)} + \sum_{\substack{d < z^2 \\ d \mid P(z)}} 3^{\omega(d)} |r(d)|, \tag{7.2}
\]

where $\omega(d)$ is the number of distinct prime divisors of $d$.


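Proof. Since $g$ is a multiplicative function, we have, by Theorem A.7,

\[
g([d_1, d_2])\,g((d_1, d_2)) = g(d_1)\,g(d_2)
\]

for all positive integers $d_1$ and $d_2$.

Let $z \ge 2$. For every divisor $d$ of $P(z)$, we shall choose a real number $\lambda(d)$
subject only to the conditions that

\[
\lambda(1) = 1
\]

and

\[
\lambda(d) = 0 \quad \text{for all } d \ge z.
\]

Since

\[
\left(\sum_{d \mid (a, P(z))} \lambda(d)\right)^2 \ge 0
\]

for all nonnegative integers $a$ and

\[
\left(\sum_{d \mid (a, P(z))} \lambda(d)\right)^2 = 1 \quad \text{if } (a, P(z)) = 1,
\]

it follows that

\begin{align*}
S(A, \mathcal{P}, z) &= \sum_{\substack{a \in A \\ (a, P(z)) = 1}} 1 \\
&\le \sum_{a \in A} \left(\sum_{d \mid (a, P(z))} \lambda(d)\right)^2 \\
&= \sum_{a \in A} \sum_{\substack{d_1 \mid a \\ d_1 \mid P(z)}} \sum_{\substack{d_2 \mid a \\ d_2 \mid P(z)}} \lambda(d_1)\lambda(d_2) \\
&= \sum_{d_1, d_2 \mid P(z)} \lambda(d_1)\lambda(d_2) \sum_{\substack{a \in A \\ [d_1, d_2] \mid a}} 1 \\
&= \sum_{d_1, d_2 \mid P(z)} \lambda(d_1)\lambda(d_2)\,|A_{[d_1, d_2]}| \\
&= \sum_{d_1, d_2 \mid P(z)} \lambda(d_1)\lambda(d_2) \left(g([d_1, d_2])|A| + r([d_1, d_2])\right) \\
&= |A| \sum_{d_1, d_2 \mid P(z)} g([d_1, d_2])\lambda(d_1)\lambda(d_2) + \sum_{d_1, d_2 \mid P(z)} \lambda(d_1)\lambda(d_2)\,r([d_1, d_2]) \\
&= |A| \sum_{d_1, d_2 \mid P(z)} \frac{1}{g((d_1, d_2))}\,g(d_1)\lambda(d_1)\,g(d_2)\lambda(d_2) + \sum_{d_1, d_2 \mid P(z)} \lambda(d_1)\lambda(d_2)\,r([d_1, d_2]) \\
&= |A|\,Q + R,
\end{align*}

where

\[
Q = \sum_{d_1, d_2 \mid P(z)} \frac{1}{g((d_1, d_2))}\,g(d_1)\lambda(d_1)\,g(d_2)\lambda(d_2)
\]

and

\[
R = \sum_{d_1, d_2 \mid P(z)} \lambda(d_1)\lambda(d_2)\,r([d_1, d_2]).
\]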
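Let $D$ be the set of all positive divisors of $P(z)$ that are strictly less than $z$, that
is,

\[
D = \{k \mid P(z) : 1 \le k < z\}.
\]

Then $D$ is a divisor-closed set of square-free integers. If $k \in D$, then $0 < g(k) \le 1$
since $0 < g(p) < 1$ for all primes $p \in \mathcal{P}$. For $k \in D$, we define the function $f(k)$
by

\[
f(k) = \sum_{d \mid k} \frac{\mu(d)}{g(k/d)} = \frac{1}{g(k)} \sum_{d \mid k} \mu(d) g(d) = \frac{1}{g(k)} \prod_{p \mid k} (1 - g(p)). \tag{7.3}
\]

Then $f(k) > 0$ and $f(k_1 k_2) = f(k_1) f(k_2)$ if $k_1, k_2 \in D$ and $(k_1, k_2) = 1$. By
Möbius inversion (Theorem A.19), we have

\[
\frac{1}{g(k)} = \sum_{d \mid k} f(d). \tag{7.4}
\]

Then

\begin{align*}
Q &= \sum_{d_1, d_2 \in D} \frac{1}{g((d_1, d_2))}\,g(d_1)\lambda(d_1)\,g(d_2)\lambda(d_2) \\
&= \sum_{d_1, d_2 \in D} \sum_{k \mid (d_1, d_2)} f(k)\,g(d_1)\lambda(d_1)\,g(d_2)\lambda(d_2) \\
&= \sum_{k \in D} f(k) \sum_{\substack{d_1, d_2 \in D \\ k \mid d_1,\, k \mid d_2}} g(d_1)\lambda(d_1)\,g(d_2)\lambda(d_2) \\
&= \sum_{k \in D} f(k) \left(\sum_{\substack{d \in D \\ k \mid d}} g(d)\lambda(d)\right)^2 \\
&= \sum_{k \in D} f(k)\,y_k^2,
\end{align*}

where

\[
y_k = \sum_{\substack{d \in D \\ k \mid d}} g(d)\lambda(d).
\]

Thus, $Q$ is a quadratic form in the variables $y_k$.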
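The set $D$ is finite and divisor-closed. By Möbius inversion (Theorem A.22),
we have

\[
g(d)\lambda(d) = \sum_{\substack{k \in D \\ d \mid k}} \mu\left(\frac{k}{d}\right) y_k = \mu(d) \sum_{\substack{k \in D \\ d \mid k}} \mu(k)\,y_k. \tag{7.5}
\]

In particular, for $d = 1$ we obtain

\[
\sum_{k \in D} \mu(k)\,y_k = 1. \tag{7.6}
\]

We define

\[
F(z) = \sum_{k \in D} \frac{1}{f(k)}.
\]

By Lemma 7.2, the minimum value of the quadratic form

\[
Q = \sum_{k \in D} f(k)\,y_k^2
\]

subject to the linear constraint (7.6) is

\[
\left(\sum_{k \in D} \frac{\mu(k)^2}{f(k)}\right)^{-1} = \left(\sum_{k \in D} \frac{1}{f(k)}\right)^{-1} = \frac{1}{F(z)},
\]

and this minimum is attained when

\[
y_k = \frac{\mu(k)}{f(k) F(z)}.
\]

We insert these values of $y_k$ into (7.5) to compute $\lambda(d)$ as follows:

\begin{align*}
\lambda(d) &= \frac{\mu(d)}{g(d)} \sum_{\substack{k \in D \\ d \mid k}} \mu(k)\,y_k \\
&= \frac{\mu(d)}{g(d)} \sum_{\substack{\ell < z/d \\ d\ell \mid P(z)}} \mu(d\ell)\,y_{d\ell} \\
&= \frac{\mu(d)}{g(d)} \sum_{\substack{\ell < z/d \\ d\ell \mid P(z)}} \mu(d\ell)\,\frac{\mu(d\ell)}{f(d\ell) F(z)} \\
&= \frac{\mu(d)}{f(d) g(d) F(z)} \sum_{\substack{\ell < z/d \\ d\ell \mid P(z)}} \frac{1}{f(\ell)} \\
&= \frac{\mu(d) F_d(z)}{f(d) g(d) F(z)},
\end{align*}

where

\[
F_d(z) = \sum_{\substack{\ell < z/d \\ d\ell \mid P(z)}} \frac{1}{f(\ell)}.
\]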


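In the preceding calculation, we used the fact that if $d\ell$ divides $P(z)$, then $d$ and $\ell$
are relatively prime since $P(z)$ is square-free. We shall use this fact again to prove
that $|\lambda(d)| \le 1$. Let $d$ be any positive divisor of $P(z)$. Then

\begin{align*}
F(z) &= \sum_{k \in D} \frac{1}{f(k)} \\
&= \sum_{\ell \mid d} \sum_{\substack{k \in D \\ (k, d) = \ell}} \frac{1}{f(k)} \\
&= \sum_{\ell \mid d} \sum_{\substack{m < z/\ell \\ \ell m \mid P(z) \\ (\ell m, d) = \ell}} \frac{1}{f(\ell m)} \\
&= \sum_{\ell \mid d} \frac{1}{f(\ell)} \sum_{\substack{m < z/\ell \\ \ell m \mid P(z) \\ (m, d) = 1}} \frac{1}{f(m)} \\
&\ge \sum_{\ell \mid d} \frac{1}{f(\ell)} \sum_{\substack{m < z/d \\ dm \mid P(z)}} \frac{1}{f(m)} \\
&= F_d(z) \sum_{\ell \mid d} \frac{1}{f(\ell)} \\
&= \frac{F_d(z)}{f(d)} \sum_{\ell \mid d} f\left(\frac{d}{\ell}\right) \\
&= \frac{F_d(z)}{f(d) g(d)}
\end{align*}

by (7.4), and so

\[
|\lambda(d)| = \frac{F_d(z)}{f(d) g(d) F(z)} \le 1.
\]

By Exercise 1, for any square-free integer $d$ there are exactly $3^{\omega(d)}$ ordered pairs of
positive integers $d_1, d_2$ such that $[d_1, d_2] = d$. If $d_1, d_2 < z$, then $d = [d_1, d_2] < z^2$.
If $d_1$ and $d_2$ divide $P(z)$, then $d = [d_1, d_2]$ is a square-free number that also divides
$P(z)$. Therefore,

\begin{align*}
|R| &= \left|\sum_{d_1, d_2 \mid P(z)} \lambda(d_1)\lambda(d_2)\,r([d_1, d_2])\right| \\
&\le \sum_{\substack{d_1, d_2 < z \\ d_1, d_2 \mid P(z)}} |r([d_1, d_2])| \\
&\le \sum_{\substack{d < z^2 \\ d \mid P(z)}} 3^{\omega(d)} |r(d)|,
\end{align*}

and so

\[
S(A, \mathcal{P}, z) \le \frac{|A|}{F(z)} + \sum_{\substack{d < z^2 \\ d \mid P(z)}} 3^{\omega(d)} |r(d)|.
\]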

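To obtain the upper bound (7.2) for the sieving function $S(A, \mathcal{P}, z)$, it is enough
to prove that $F(z) \ge G(z)$. Let $g_1(k)$ be a completely multiplicative function such
that

\[
g_1(p) = g(p) \quad \text{for all primes } p \in \mathcal{P}.
\]

By (7.3),

\begin{align*}
F(z) &= \sum_{k \in D} \frac{1}{f(k)} \\
&= \sum_{k \in D} g(k) \prod_{p \mid k} (1 - g(p))^{-1} \\
&= \sum_{k \in D} g_1(k) \prod_{p \mid k} (1 - g_1(p))^{-1} \\
&= \sum_{k \in D} g_1(k) \prod_{p \mid k} \sum_{r=0}^{\infty} g_1(p)^r \\
&= \sum_{k \in D} g_1(k) \prod_{p \mid k} \sum_{r=0}^{\infty} g_1(p^r) \\
&= \sum_{k \in D} g_1(k) \sum_{\substack{\ell = 1 \\ p \mid \ell \Rightarrow p \mid k}}^{\infty} g_1(\ell) \\
&= \sum_{k \in D} \sum_{\substack{\ell = 1 \\ p \mid \ell \Rightarrow p \mid k}}^{\infty} g_1(k\ell) \\
&= \sum_{k \in D} \sum_{\substack{m = 1 \\ k \mid m \\ p \mid (m/k) \Rightarrow p \mid k}}^{\infty} g_1(m) \\
&= \sum_{m=1}^{\infty} g_1(m) \sum_{\substack{k \in D \\ k \mid m \\ p \mid (m/k) \Rightarrow p \mid k}} 1 \\
&\ge \sum_{\substack{m < z \\ p \mid m \Rightarrow p \in \mathcal{P}}} g_1(m) \sum_{\substack{k \mid m \\ p \mid (m/k) \Rightarrow p \mid k}} 1 \\
&\ge \sum_{\substack{m < z \\ p \mid m \Rightarrow p \in \mathcal{P}}} g_1(m) \\
&= G(z),
\end{align*}

since, in the last inner sum, we can always choose $k$ to be the “square-free kernel”
of $m$, that is, the product of the distinct primes dividing $m$. This completes the
proof of the theorem.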
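The inequality $F(z) \ge G(z)$ can be observed in the simplest special case. Assume, for illustration only, that $\mathcal{P}$ is the set of all primes and $g(p) = 1/p$. Then $g_1(m) = 1/m$, the set $D$ consists of the square-free integers less than $z$, and by (7.3) $f(k) = k \prod_{p \mid k}(1 - 1/p) = \varphi(k)$. The sketch below (with hypothetical helper names, and $z$ taken to be an integer) computes both sums exactly.

```python
from fractions import Fraction


def prime_factors(n):
    """Prime factors of n with multiplicity, by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def F_and_G(z):
    """Return (F(z), G(z)) for g(p) = 1/p and P = all primes, with integer z >= 2."""
    F = Fraction(0)
    G = Fraction(0)
    for m in range(1, z):
        G += Fraction(1, m)              # g_1(m) = 1/m
        fs = prime_factors(m)
        if len(fs) == len(set(fs)):      # m is square-free, so m lies in D
            phi = 1
            for p in fs:
                phi *= p - 1             # f(m) = phi(m) for square-free m
            F += Fraction(1, phi)
    return F, G


if __name__ == "__main__":
    for z in (2, 10, 50, 200):
        F, G = F_and_G(z)
        print(z, float(F), float(G))
```

In this special case the inequality says that the sum of $1/\varphi(k)$ over square-free $k < z$ is at least the harmonic sum over $m < z$.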


7.3 Applications of the sieve 


In this section, we shall obtain an upper bound for the number of representations 
of an even integer as the sum of two primes. We also derive an upper bound for the 
number of representations of an even integer N as the difference of two primes, 
that is, an upper bound for the number of primes p < x such that p + N is also 
prime. 


Theorem 7.2 Let $N$ be an even integer, and let $r(N)$ denote the number of
representations of $N$ as the sum of two primes. Then

\[
r(N) \ll \frac{N}{(\log N)^2} \prod_{p \mid N} \left(1 + \frac{1}{p}\right),
\]

where the implied constant is absolute.


Proof. The representation function $r(N)$ counts the number of primes $p \le N$
such that $N - p$ is also prime. Let

\[
a_n = n(N - n).
\]

Then

\[
A = \{a_n\}_{n=1}^{N}
\]

is a finite sequence of integers with $|A| = N$ terms. Let $\mathcal{P}$ be the set of all prime
numbers. Let

\[
2 < z \le \sqrt{N}.
\]

The sieving function $S(A, \mathcal{P}, z)$ denotes the number of terms of the sequence $A$
that are divisible by no prime $p < z$. If

\[
\sqrt{N} < n < N - \sqrt{N},
\]

and if $a_n \equiv 0 \pmod{p}$ for some prime $p < z$, then either $n$ or $N - n$ is composite.
This implies that

\[
r(N) \le 2\sqrt{N} + S(A, \mathcal{P}, z). \tag{7.7}
\]

We shall use the Selberg sieve to obtain an upper bound for $S(A, \mathcal{P}, z)$. We continue
to use the notation of Theorem 7.1.

Let $g(m)$ be the completely multiplicative function defined by

\[
g(p) = \begin{cases} 2/p & \text{if } p \text{ does not divide } N \\ 1/p & \text{if } p \text{ divides } N. \end{cases} \tag{7.8}
\]

Then $g_1(m) = g(m)$ for all $m$. Since $N$ is even, 2 divides $N$ and

\[
0 < g(p) < 1
\]

for all primes $p$. Also,

\[
a_n = n(N - n) \equiv 0 \pmod{p}
\]

if and only if

\[
n \equiv 0 \pmod{p} \quad \text{or} \quad n \equiv N \pmod{p}.
\]

If $p$ does not divide $N$, then $N \not\equiv 0 \pmod{p}$ and these two congruences are
distinct. If $p$ divides $N$, then $N \equiv 0 \pmod{p}$ and these two congruences are the
same. Let

\[
d = p_1 \cdots p_k q_1 \cdots q_\ell
\]

be a square-free integer, where the primes $p_i$ divide $N$ and the primes $q_j$ do not
divide $N$. Then

\[
g(d) = \frac{2^\ell}{d}.
\]

Since $a_n \equiv 0 \pmod{d}$ if and only if $a_n \equiv 0 \pmod{p}$ for every prime $p$ dividing
$d$, it follows from the Chinese remainder theorem that there are exactly $2^\ell$ pairwise
distinct congruence classes modulo $d$ such that $a_n \equiv 0 \pmod{d}$ if and only if $n$
belongs to one of these $2^\ell$ classes. Therefore,

\[
|A_d| = |A| g(d) + r(d),
\]

where

\[
|r(d)| \le 2^\ell \le 2^{\omega(d)}. \tag{7.9}
\]
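By the Selberg sieve,

\[
S(A, \mathcal{P}, z) \le \frac{|A|}{G(z)} + \sum_{\substack{d < z^2 \\ d \mid P(z)}} 3^{\omega(d)} |r(d)|,
\]

where

\[
G(z) = \sum_{m < z} g(m)
\]

and $\omega(d)$ is the number of distinct prime divisors of $d$. Let

\[
m = \prod_{i=1}^{k} p_i^{r_i} \prod_{j=1}^{\ell} q_j^{s_j},
\]

where the primes $p_i$ divide $N$ and the primes $q_j$ do not divide $N$. Then

\[
g(m) = \prod_{i=1}^{k} \left(\frac{1}{p_i}\right)^{r_i} \prod_{j=1}^{\ell} \left(\frac{2}{q_j}\right)^{s_j} = \frac{2^{s_1 + \cdots + s_\ell}}{m}.
\]

Let $d_N(m)$ denote the number of positive divisors of $m$ that are relatively prime to
$N$. Then

\[
d_N(m) = d\left(\prod_{j=1}^{\ell} q_j^{s_j}\right) = \prod_{j=1}^{\ell} (s_j + 1) \le \prod_{j=1}^{\ell} 2^{s_j} = 2^{s_1 + \cdots + s_\ell}.
\]

Therefore,

\[
g(m) \ge \frac{d_N(m)}{m}
\]

and so

\[
G(z) = \sum_{m < z} g(m) \ge \sum_{m < z} \frac{d_N(m)}{m}.
\]

Since

\[
\prod_{p \mid N} \left(1 - \frac{1}{p}\right)^{-1} = \sum_{\substack{t = 1 \\ p \mid t \Rightarrow p \mid N}}^{\infty} \frac{1}{t},
\]

it follows that

\begin{align*}
\prod_{p \mid N} \left(1 - \frac{1}{p}\right)^{-1} G(z) &\ge \sum_{m < z} \frac{d_N(m)}{m} \sum_{\substack{t = 1 \\ p \mid t \Rightarrow p \mid N}}^{\infty} \frac{1}{t} \\
&= \sum_{m < z} d_N(m) \sum_{\substack{t = 1 \\ p \mid t \Rightarrow p \mid N}}^{\infty} \frac{1}{mt} \\
&= \sum_{m < z} d_N(m) \sum_{\substack{w = 1 \\ m \mid w \\ p \mid (w/m) \Rightarrow p \mid N}}^{\infty} \frac{1}{w} \\
&= \sum_{w=1}^{\infty} \frac{1}{w} \sum_{\substack{m < z \\ m \mid w \\ p \mid (w/m) \Rightarrow p \mid N}} d_N(m) \\
&\ge \sum_{w < z} \frac{1}{w} \sum_{\substack{m \mid w \\ p \mid (w/m) \Rightarrow p \mid N}} d_N(m).
\end{align*}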
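Let

\[
w = \prod_{i=1}^{k} p_i^{u_i} \prod_{j=1}^{\ell} q_j^{v_j}
\]

and

\[
m = \prod_{i=1}^{k} p_i^{r_i} \prod_{j=1}^{\ell} q_j^{s_j},
\]

where the primes $p_i$ divide $N$ and the primes $q_j$ do not divide $N$. Since $m$ divides
$w$, it follows that $0 \le r_i \le u_i$ for all $i$, $0 \le s_j \le v_j$ for all $j$, and

\[
\frac{w}{m} = \prod_{i=1}^{k} p_i^{u_i - r_i} \prod_{j=1}^{\ell} q_j^{v_j - s_j}.
\]

Since every prime divisor of $w/m$ divides $N$, it follows that no prime $q_j$ divides
$w/m$, and so $s_j = v_j$ for all $j$. Therefore,

\[
m = \prod_{i=1}^{k} p_i^{r_i} \prod_{j=1}^{\ell} q_j^{v_j}
\]

and

\[
d_N(m) = \prod_{j=1}^{\ell} (v_j + 1).
\]

For each integer $w$, the number of such divisors $m$ is

\[
\prod_{i=1}^{k} (u_i + 1).
\]

It follows that for every positive integer $w < z$, we have

\[
\sum_{\substack{m \mid w \\ p \mid (w/m) \Rightarrow p \mid N}} d_N(m) = \sum_{\substack{m \mid w \\ p \mid (w/m) \Rightarrow p \mid N}} \prod_{j=1}^{\ell} (v_j + 1) = \prod_{i=1}^{k} (u_i + 1) \prod_{j=1}^{\ell} (v_j + 1) = d(w),
\]

where the divisor function $d(w)$ counts the number of all positive divisors of $w$.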


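Let

\[
z = N^{1/8}.
\]

From Theorem A.13 we obtain

\[
\prod_{p \mid N} \left(1 - \frac{1}{p}\right)^{-1} G(z) \ge \sum_{w < z} \frac{d(w)}{w} \gg (\log z)^2 \gg (\log N)^2.
\]

Equivalently,

\begin{align*}
\frac{|A|}{G(z)} &\ll \frac{N}{(\log N)^2} \prod_{p \mid N} \left(1 - \frac{1}{p}\right)^{-1} \\
&= \frac{N}{(\log N)^2} \prod_{p \mid N} \left(1 - \frac{1}{p^2}\right)^{-1} \left(1 + \frac{1}{p}\right) \\
&\ll \frac{N}{(\log N)^2} \prod_{p \mid N} \left(1 + \frac{1}{p}\right),
\end{align*}

since the infinite product $\prod_{p} (1 - p^{-2})^{-1}$ converges.

To find an upper bound for the remainder, we use (7.9) to obtain

\[
R = \sum_{\substack{d < z^2 \\ d \mid P(z)}} 3^{\omega(d)} |r(d)| \le \sum_{\substack{d < z^2 \\ d \mid P(z)}} 3^{\omega(d)} 2^{\omega(d)} \le \sum_{d < z^2} 6^{\omega(d)}.
\]

Since

\[
2^{\omega(d)} \le d
\]

and

\[
6^{\omega(d)} = \left(2^{\omega(d)}\right)^{\log 6/\log 2} \le d^{\log 6/\log 2} < z^{2\log 6/\log 2},
\]

it follows that

\[
R \le \sum_{d < z^2} z^{2\log 6/\log 2} < z^{2 + 2\log 6/\log 2} < z^{36/5} = N^{9/10}
\]

since $z = N^{1/8}$. Then

\[
S(A, \mathcal{P}, z) \ll \frac{N}{(\log N)^2} \prod_{p \mid N} \left(1 + \frac{1}{p}\right) + N^{9/10} \ll \frac{N}{(\log N)^2} \prod_{p \mid N} \left(1 + \frac{1}{p}\right),
\]

and so

\[
r(N) \le 2\sqrt{N} + S(A, \mathcal{P}, z) \ll \frac{N}{(\log N)^2} \prod_{p \mid N} \left(1 + \frac{1}{p}\right).
\]

This completes the proof.

Theorem 7.3 Let $N$ be a positive even integer, and let $\pi_N(x)$ denote the number
of primes $p$ up to $x$ such that $p + N$ is also prime. Then

\[
\pi_N(x) \ll \frac{x}{(\log x)^2} \prod_{p \mid N} \left(1 + \frac{1}{p}\right),
\]

where the implied constant is absolute.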
Proof. The proof is similar to the proof of Theorem 7.2. It starts as follows. Let

\[
A = \{a_n : 1 \le n \le x\}
\]

be the finite sequence of integers

\[
a_n = n(n + N).
\]

Then $|A| = [x]$. Let $\mathcal{P}$ be the set of all prime numbers. For any $z$ satisfying

\[
2 < z \le \sqrt{x},
\]

we let $S(A, \mathcal{P}, z)$ denote the number of terms of the sequence $A$ that are divisible
by no prime $p < z$. If

\[
n > \sqrt{x}
\]

and $a_n \equiv 0 \pmod{p}$ for some prime $p < z$, then either $n$ or $n + N$ is composite.
This implies that

\[
\pi_N(x) \le \sqrt{x} + S(A, \mathcal{P}, z).
\]

We again use the Selberg sieve to obtain an upper bound for $S(A, \mathcal{P}, z)$. Let

\[
d = p_1 \cdots p_k q_1 \cdots q_\ell
\]

be a square-free integer, where the primes $p_i$ divide $N$ and the primes $q_j$ do not
divide $N$. Let $|A_d|$ denote the number of terms of the sequence $A$ that are divisible
by $d$. For every square-free integer $d$,

\[
|A_d| = |A| g(d) + r(d),
\]

where $g(d)$ is the completely multiplicative function defined by (7.8), and

\[
|r(d)| \le 2^\ell \le 2^{\omega(d)}.
\]

Then

\[
S(A, \mathcal{P}, z) \le \frac{|A|}{G(z)} + \sum_{\substack{d < z^2 \\ d \mid P(z)}} 3^{\omega(d)} |r(d)|,
\]

where

\[
G(z) = \sum_{m < z} g(m).
\]

The proof continues exactly as above.
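Theorem 7.3 gives only the order of magnitude, with an unspecified absolute constant, but it is instructive to compare the exact count $\pi_N(x)$ with the quantity $\frac{x}{(\log x)^2} \prod_{p \mid N}(1 + 1/p)$ for a few values of $N$. The sketch below is an illustration only; the function names are hypothetical, and no claim is made about the value of the implied constant.

```python
import math


def primes_up_to(n):
    """Sieve of Eratosthenes: return the list of primes p <= n (assumes n >= 2)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]


def prime_pair_count(x, N):
    """pi_N(x): the number of primes p <= x such that p + N is also prime."""
    ps = primes_up_to(x + N)
    prime_set = set(ps)
    return sum(1 for p in ps if p <= x and p + N in prime_set)


def sieve_bound_shape(x, N):
    """x / (log x)^2 times the product of (1 + 1/p) over primes p dividing N."""
    product = 1.0
    for p in primes_up_to(N):
        if N % p == 0:
            product *= 1.0 + 1.0 / p
    return x / math.log(x) ** 2 * product


if __name__ == "__main__":
    x = 10**5
    for N in (2, 6, 30):
        print(N, prime_pair_count(x, N), sieve_bound_shape(x, N))
```

The ratio of the two printed quantities stays bounded as $N$ varies, which is the content of the statement that the implied constant is absolute.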
In the case where $N = 2$, we obtain the following improvement of Brun’s
Theorem 6.11.


Theorem 7.4 Let $\pi_2(x)$ denote the number of twin primes up to $x$. Then

\[
\pi_2(x) \ll \frac{x}{(\log x)^2}.
\]


7.4 Shnirel’man density 


Let $A$ be a set of integers. For any real number $x$, let $A(x)$ denote the number of
positive elements of $A$ not exceeding $x$, that is,

\[
A(x) = \sum_{\substack{a \in A \\ 1 \le a \le x}} 1.
\]

The function $A(x)$ is called the counting function of the set $A$. For $x > 0$ we have

\[
0 \le A(x) \le [x] \le x
\]

and so

\[
0 \le \frac{A(x)}{x} \le 1.
\]

The Shnirel’man density of the set $A$, denoted $\sigma(A)$, is defined by

\[
\sigma(A) = \inf_{n = 1, 2, 3, \ldots} \frac{A(n)}{n}.
\]

Clearly,

\[
0 \le \sigma(A) \le 1
\]

for every set $A$ of integers. If $\sigma(A) = \alpha$, then

\[
A(n) \ge \alpha n
\]

for all $n = 1, 2, 3, \ldots$. If $1 \notin A$, then $A(1) = 0$ and so $\sigma(A) = 0$.
If $A$ contains every positive integer, then $A(n) = n$ for all $n \ge 1$ and so $\sigma(A) = 1$.
If $m \notin A$ for some $m \ge 1$, then $A(m) \le m - 1$ and

\[
\sigma(A) \le \frac{A(m)}{m} \le 1 - \frac{1}{m} < 1.
\]

Thus, $\sigma(A) = 1$ if and only if $A$ contains every positive integer.
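Since $\sigma(A)$ is an infimum over all $n \ge 1$, a finite computation can only give the truncated quantity $\min_{1 \le n \le M} A(n)/n$, which is an upper bound for $\sigma(A)$. The following sketch (with the hypothetical name `truncated_density`) computes this minimum exactly.

```python
from fractions import Fraction


def truncated_density(A, M):
    """Return the minimum of A(n)/n over 1 <= n <= M.

    A is any iterable of integers; A(n) counts its elements a with 1 <= a <= n.
    The result is an upper bound for the Shnirel'man density of A.
    """
    members = set(A)
    count = 0
    best = Fraction(1)
    for n in range(1, M + 1):
        if n in members:
            count += 1
        best = min(best, Fraction(count, n))
    return best


if __name__ == "__main__":
    squares = {k * k for k in range(0, 11)}
    odds = set(range(1, 100, 2))
    evens = set(range(0, 101, 2))
    print(truncated_density(squares, 100))  # squares thin out
    print(truncated_density(odds, 100))     # odd numbers
    print(truncated_density(evens, 100))    # 1 is missing, so the value is 0
```

For the squares the truncated minimum tends to 0 as $M$ grows, so the squares have Shnirel’man density 0 even though they form a basis of order four.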

If $A$ and $B$ are sets of integers, the sumset $A + B$ is the set consisting of all
integers of the form $a + b$, where $a \in A$ and $b \in B$. If $A_1, \ldots, A_h$ are $h$ sets of
integers, then
\[ A_1 + A_2 + \cdots + A_h \]
denotes the set of all integers of the form $a_1 + a_2 + \cdots + a_h$, where $a_i \in A_i$ for
$i = 1, 2, \ldots, h$. If $A_i = A$ for $i = 1, 2, \ldots, h$, we let
\[ hA = \underbrace{A + \cdots + A}_{h \text{ times}}. \]


The set A is called a basis of order h if hA contains every nonnegative integer, that 
is, if every nonnegative integer can be represented as the sum of h not necessarily 
distinct elements of A. The set A is called a basis of finite order if A is a basis of 
order h for some h > 1. 

Shnirel’man density is an important additive measure of the size of a set of
integers. In particular, the set $A$ is a basis of order $h$ if and only if $\sigma(hA) = 1$, and
the set $A$ is a basis of finite order if and only if $\sigma(hA) = 1$ for some $h \ge 1$.

Shnirel’man made the simple but extraordinarily powerful discovery that if A 
is a set of integers that contains 0 and has positive Shnirel’man density, then A is 
a basis of finite order. 


Lemma 7.3 Let $A$ and $B$ be sets of integers such that $0 \in A$ and $0 \in B$. If $n \ge 0$ and
$A(n) + B(n) \ge n$, then $n \in A + B$.

Proof. If $n \in A$, then $n = n + 0 \in A + B$. Similarly, if $n \in B$, then $n = 0 + n \in A + B$.
Suppose that $n \notin A \cup B$. Define sets $A'$ and $B'$ by
\[ A' = \{n - a : a \in A,\ 1 \le a \le n - 1\} \]
and
\[ B' = \{b : b \in B,\ 1 \le b \le n - 1\}. \]
Then $|A'| = A(n)$ since $n \notin A$, and $|B'| = B(n)$ since $n \notin B$. Moreover,
\[ A' \cup B' \subseteq [1, n - 1]. \]
Since
\[ |A'| + |B'| = A(n) + B(n) \ge n, \]
it follows that
\[ A' \cap B' \ne \emptyset. \]
Therefore, $n - a = b$ for some $a \in A$ and $b \in B$, and so $n = a + b \in A + B$.
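
The pigeonhole argument of Lemma 7.3 can be checked by brute force on small finite sets. The following is a minimal sketch, not part of the text: the function names and the sample sets are illustrative assumptions. It lists every $n$ up to a limit for which the hypothesis $A(n) + B(n) \ge n$ holds but $n$ is not in $A + B$; by the lemma this list should be empty.

```python
def count_upto(S, n):
    """Counting function S(n): number of positive elements of S not exceeding n."""
    return sum(1 for s in S if 1 <= s <= n)


def in_sumset(A, B, n):
    """True if n = a + b for some a in A and b in B."""
    return any((n - a) in B for a in A)


def check_lemma_7_3(A, B, limit):
    """Return all n in [0, limit] with A(n) + B(n) >= n that are NOT in A + B."""
    assert 0 in A and 0 in B, "the lemma assumes 0 lies in both sets"
    return [n for n in range(limit + 1)
            if count_upto(A, n) + count_upto(B, n) >= n
            and not in_sumset(A, B, n)]


# Illustrative sample sets (assumed, not taken from the text).
A = {0, 1, 2, 4, 7, 8}
B = {0, 1, 3, 5, 6, 9}
print(check_lemma_7_3(A, B, 12))
```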


Lemma 7.4 Let $A$ and $B$ be sets of integers such that $0 \in A$ and $0 \in B$. If
$\sigma(A) + \sigma(B) \ge 1$, then $n \in A + B$ for every nonnegative integer $n$.

Proof. Let $\sigma(A) = \alpha$ and $\sigma(B) = \beta$. If $n \ge 0$, then
\[ A(n) + B(n) \ge (\alpha + \beta) n \ge n, \]
and Lemma 7.3 implies that $n \in A + B$.

Lemma 7.5 Let $A$ be a set of integers such that $0 \in A$ and $\sigma(A) \ge 1/2$. Then $A$
is a basis of order 2.

Proof. This follows immediately from Lemma 7.4 with $A = B$.


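Theorem 7.5 (Shnirel’man) Let $A$ and $B$ be sets of integers such that $0 \in A$ and
$0 \in B$. Let $\sigma(A) = \alpha$ and $\sigma(B) = \beta$. Then
\[ \sigma(A + B) \ge \alpha + \beta - \alpha\beta. \tag{7.10} \]

Proof. Let $n \ge 1$. Let $a_0 = 0$ and let
\[ 1 \le a_1 < \cdots < a_k \le n \]
be the $k = A(n)$ positive elements of $A$ that do not exceed $n$. Since $0 \in B$, it
follows that $a_i = a_i + 0 \in A + B$ for $i = 1, \ldots, k$. For $i = 0, \ldots, k - 1$, let
\[ 1 \le b_1 < \cdots < b_{r_i} \le a_{i+1} - a_i - 1 \]
be the $r_i = B(a_{i+1} - a_i - 1)$ positive elements of $B$ less than $a_{i+1} - a_i$. Then
\[ a_i < a_i + b_1 < \cdots < a_i + b_{r_i} < a_{i+1} \]
and
\[ a_i + b_j \in A + B \]
for $j = 1, \ldots, r_i$. Let
\[ 1 \le b_1 < \cdots < b_{r_k} \le n - a_k \]
be the $r_k = B(n - a_k)$ positive elements of $B$ not exceeding $n - a_k$. Then
\[ a_k < a_k + b_1 < \cdots < a_k + b_{r_k} \le n \]
and
\[ a_k + b_j \in A + B \]
for $j = 1, \ldots, r_k$. It follows that
\begin{align*}
(A + B)(n) &\ge A(n) + \sum_{i=0}^{k-1} B(a_{i+1} - a_i - 1) + B(n - a_k) \\
&\ge A(n) + \beta \sum_{i=0}^{k-1} (a_{i+1} - a_i - 1) + \beta(n - a_k) \\
&= A(n) + \beta \sum_{i=0}^{k-1} (a_{i+1} - a_i) + \beta(n - a_k) - \beta k \\
&= A(n) + \beta n - \beta k \\
&= A(n) + \beta n - \beta A(n) \\
&= (1 - \beta) A(n) + \beta n \\
&\ge (1 - \beta)\alpha n + \beta n \\
&= (\alpha + \beta - \alpha\beta) n
\end{align*}
and so
\[ \frac{(A + B)(n)}{n} \ge \alpha + \beta - \alpha\beta. \]
Therefore,
\[ \sigma(A + B) = \inf_{n = 1, 2, \ldots} \frac{(A + B)(n)}{n} \ge \alpha + \beta - \alpha\beta. \]
This completes the proof.

Inequality (7.10) can be expressed as follows:
\[ 1 - \sigma(A + B) \le (1 - \sigma(A))(1 - \sigma(B)). \tag{7.11} \]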
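
Shnirel’man's inequality can be tested numerically on finite truncations. The sketch below is illustrative only: it replaces the density (an infimum over all $n$) by the minimum of $A(n)/n$ over $1 \le n \le N$, which is merely an upper bound for the true density. The counting argument in the proof of Theorem 7.5 uses only values up to $n$, so the same inequality holds for these truncated quantities. The sets and the cutoff `N` are assumptions chosen for the example.

```python
from fractions import Fraction


def counting(S, n):
    """Number of positive elements of S not exceeding n."""
    return sum(1 for s in S if 1 <= s <= n)


def truncated_density(S, N):
    """min of S(n)/n over 1 <= n <= N (an upper bound for the Shnirel'man density)."""
    return min(Fraction(counting(S, n), n) for n in range(1, N + 1))


def sumset(A, B, N):
    """Elements of A + B that do not exceed N."""
    return {a + b for a in A for b in B if a + b <= N}


N = 60
A = {0} | {n for n in range(1, N + 1) if n % 3 == 1}  # 0, 1, 4, 7, ...
B = {0} | {n for n in range(1, N + 1) if n % 4 == 1}  # 0, 1, 5, 9, ...

alpha = truncated_density(A, N)
beta = truncated_density(B, N)
sigma = truncated_density(sumset(A, B, N), N)

# Truncated form of (7.10): sigma >= alpha + beta - alpha*beta.
print(alpha, beta, sigma, alpha + beta - alpha * beta)
```

Exact rational arithmetic (`Fraction`) is used so that the comparison is not affected by rounding.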


The following theorem generalizes this inequality to the sum of any finite number 
of sets of integers. 


Theorem 7.6 Let $h \ge 1$, and let $A_1, \ldots, A_h$ be sets of integers such that $0 \in A_i$
for $i = 1, \ldots, h$. Then
\[ 1 - \sigma(A_1 + \cdots + A_h) \le \prod_{i=1}^{h} (1 - \sigma(A_i)). \]

Proof. This is by induction on $h$. Let $\sigma(A_i) = \alpha_i$ for $i = 1, \ldots, h$. For $h = 1$,
there is nothing to prove, and for $h = 2$ it is inequality (7.11).

Let $h \ge 3$, and assume that the theorem holds for $h - 1$. Let $A_1, \ldots, A_h$ be $h$
sets of integers such that $0 \in A_i$ for all $i$. Let $B = A_2 + \cdots + A_h$. It follows from
the induction hypothesis that
\[ 1 - \sigma(B) = 1 - \sigma(A_2 + \cdots + A_h) \le \prod_{i=2}^{h} (1 - \sigma(A_i)), \]
and so
\begin{align*}
1 - \sigma(A_1 + \cdots + A_h) &= 1 - \sigma(A_1 + B) \\
&\le (1 - \sigma(A_1))(1 - \sigma(B)) \\
&\le (1 - \sigma(A_1)) \prod_{i=2}^{h} (1 - \sigma(A_i)) \\
&= \prod_{i=1}^{h} (1 - \sigma(A_i)).
\end{align*}
This completes the proof.


Theorem 7.7 (Shnirel’man) Let $A$ be a set of integers such that $0 \in A$ and
$\sigma(A) > 0$. Then $A$ is a basis of finite order.

Proof. Let $\sigma(A) = \alpha > 0$. Then $0 \le 1 - \alpha < 1$, and so
\[ 0 \le (1 - \alpha)^{\ell} \le 1/2 \]
for some integer $\ell \ge 1$. By Theorem 7.6,
\[ 1 - \sigma(\ell A) \le (1 - \sigma(A))^{\ell} = (1 - \alpha)^{\ell} \le 1/2, \]
and so
\[ \sigma(\ell A) \ge 1/2. \]
Let $h = 2\ell$. It follows from Lemma 7.5 that the set $\ell A$ is a basis of order 2, and so
$A$ is a basis of order $2\ell = h$. This completes the proof.
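
The proof of Theorem 7.7 is constructive: it yields an explicit (usually far from optimal) order $h = 2\ell$ once the density $\alpha$ is known. The following sketch, with an illustrative function name that does not appear in the text, computes the smallest $\ell$ with $(1 - \alpha)^{\ell} \le 1/2$ and the resulting bound $h$.

```python
from fractions import Fraction


def basis_order_bound(alpha):
    """Return (l, h): the least l >= 1 with (1 - alpha)^l <= 1/2, and h = 2*l."""
    alpha = Fraction(alpha)
    if not 0 < alpha <= 1:
        raise ValueError("alpha must satisfy 0 < alpha <= 1")
    l, power = 1, 1 - alpha
    while power > Fraction(1, 2):
        l += 1
        power *= 1 - alpha
    return l, 2 * l


# Example: a set of density 1/4 is a basis of order at most 6 by this argument.
print(basis_order_bound(Fraction(1, 4)))
```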
7.5 The Shnirel’man-Goldbach theorem


We shall apply Shnirel’man’s criterion for a set of integers to be a basis of finite
order to prove that every integer greater than one is a sum of a bounded number of 
primes. We begin by proving that the set consisting of 0, 1, and the numbers that 
can be represented as the sum of two primes has positive Shnirel’man density. To 
do this, we need estimates for the average number of representations of an integer 
as the sum of two primes. 
Lemma 7.6 Let $r(N)$ denote the number of representations of the integer $N$ as
the sum of two primes. Then
\[ \sum_{N \le x} r(N) \gg \frac{x^2}{(\log x)^2}. \]

Proof. If $p$ and $q$ are primes such that $p, q \le x/2$, then $p + q \le x$. Therefore,
\[ \sum_{N \le x} r(N) \ge \pi(x/2)^2 \gg \frac{(x/2)^2}{(\log(x/2))^2} \gg \frac{x^2}{(\log x)^2} \]
by Chebyshev’s theorem (Theorem 6.3).


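Lemma 7.7 Let $r(N)$ denote the number of representations of $N$ as the sum of
two primes. Then
\[ \sum_{N \le x} r(N)^2 \ll \frac{x^3}{(\log x)^4}. \]

Proof. By Theorem 7.2, if $N$ is even, then
\[ r(N) \ll \frac{N}{(\log N)^2} \prod_{p \mid N} \left(1 + \frac{1}{p}\right) \le \frac{N}{(\log N)^2} \sum_{d \mid N} \frac{1}{d}. \]
This inequality also holds for odd integers, since an odd integer $N$ can be written
as the sum of two primes if and only if $N - 2$ is prime, in which case $r(N) = 2$.
In the following calculation, we use the fact that
\[ [d_1, d_2] = \frac{d_1 d_2}{(d_1, d_2)} \ge (d_1 d_2)^{1/2}. \]
Then
\begin{align*}
\sum_{N \le x} r(N)^2 &\ll \frac{x^2}{(\log x)^4} \sum_{N \le x} \left( \sum_{d \mid N} \frac{1}{d} \right)^2 \\
&= \frac{x^2}{(\log x)^4} \sum_{N \le x} \sum_{d_1 \mid N} \sum_{d_2 \mid N} \frac{1}{d_1 d_2} \\
&= \frac{x^2}{(\log x)^4} \sum_{d_1, d_2 \le x} \frac{1}{d_1 d_2} \sum_{\substack{N \le x \\ d_1 \mid N,\ d_2 \mid N}} 1 \\
&= \frac{x^2}{(\log x)^4} \sum_{d_1, d_2 \le x} \frac{1}{d_1 d_2} \sum_{\substack{N \le x \\ [d_1, d_2] \mid N}} 1 \\
&\le \frac{x^2}{(\log x)^4} \sum_{d_1, d_2 \le x} \frac{1}{d_1 d_2} \cdot \frac{x}{[d_1, d_2]} \\
&\le \frac{x^3}{(\log x)^4} \sum_{d_1, d_2 \le x} \frac{1}{(d_1 d_2)^{3/2}} \\
&\ll \frac{x^3}{(\log x)^4}.
\end{align*}
This completes the proof.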


Theorem 7.8 The set
\[ A = \{0, 1\} \cup \{p + q : p, q \text{ primes}\} \]
has positive Shnirel’man density.

Proof. Let $r(N)$ denote the number of representations of $N$ as the sum of two
primes. By the Cauchy-Schwarz inequality, we have
\[ \left( \sum_{N \le x} r(N) \right)^2 \le \sum_{\substack{N \le x \\ r(N) \ge 1}} 1 \cdot \sum_{N \le x} r(N)^2 \le A(x) \sum_{N \le x} r(N)^2. \]
By Lemma 7.6 and Lemma 7.7,
\[ \frac{A(x)}{x} \ge \frac{1}{x} \cdot \frac{\left( \sum_{N \le x} r(N) \right)^2}{\sum_{N \le x} r(N)^2} \gg \frac{1}{x} \cdot \frac{x^4}{(\log x)^4} \cdot \frac{(\log x)^4}{x^3} = 1. \]
This means that there exists a number $c_1 > 0$ such that $A(x) \ge c_1 x$ for all $x \ge x_0$.
Since 1 belongs to the set $A$, it follows that there exists a number $c_2 > 0$ such that
$A(x) \ge c_2 x$ for $1 \le x \le x_0$. Therefore, $A(x) \ge \min(c_1, c_2) x$ for all $x \ge 1$, and so
the Shnirel’man density of $A$ is positive. This completes the proof.
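
The quantities in this proof are easy to tabulate for small $x$. The sketch below is an illustration under stated assumptions, not part of the text: the helper names are hypothetical, and $r(N)$ counts ordered pairs of primes, which matches the remark in Lemma 7.7 that $r(N) = 2$ when $N - 2$ is an odd prime. It computes $r(N)$ for $N \le x$, the counting function $A(x)$, and the minimum of $A(n)/n$ over $n \le x$, and it checks the Cauchy-Schwarz step numerically.

```python
def primes_upto(n):
    """Sieve of Eratosthenes: list of primes <= n (n >= 2 assumed)."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, n + 1, p):
                sieve[m] = False
    return [i for i in range(n + 1) if sieve[i]]


def goldbach_counts(x):
    """r[N] = number of ordered pairs of primes (p, q) with p + q = N, N <= x."""
    ps = primes_upto(x)
    r = [0] * (x + 1)
    for p in ps:
        for q in ps:
            if p + q > x:
                break
            r[p + q] += 1
    return r


def density_profile(x):
    """Return (min over n <= x of A(n)/n, A(x), r) for A = {0,1} U {p+q}."""
    r = goldbach_counts(x)
    count, ratios = 0, []
    for n in range(1, x + 1):
        if n == 1 or r[n] > 0:
            count += 1
        ratios.append(count / n)
    return min(ratios), count, r


lowest, A_x, r = density_profile(1000)
print(lowest, A_x / 1000)
```

A finite computation of this kind only bounds the density from above; positivity of the density is what the theorem proves.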


Theorem 7.9 (Goldbach—Shnirel’man) Every integer greater than one is the sum 
of a bounded number of primes. 


Proof. We have shown that the set
\[ A = \{0, 1\} \cup \{p + q : p, q \text{ primes}\} \]
has positive Shnirel’man density. By Theorem 7.7, there exists an integer $h$ such
that every nonnegative integer is the sum of exactly $h$ elements of $A$. Let $N \ge 2$.
Then $N - 2 \ge 0$, so for some integers $k$ and $\ell$ with $k + \ell \le h$ there exist $\ell$ pairs
of primes $p_i, q_i$ such that
\[ N - 2 = \underbrace{1 + \cdots + 1}_{k} + (p_1 + q_1) + \cdots + (p_\ell + q_\ell). \]
Let $k = 2m + r$, where $r = 0$ or 1. If $r = 0$, then
\[ N = \underbrace{2 + \cdots + 2}_{m + 1} + (p_1 + q_1) + \cdots + (p_\ell + q_\ell). \]
If $r = 1$, then
\[ N = \underbrace{2 + \cdots + 2}_{m} + 3 + (p_1 + q_1) + \cdots + (p_\ell + q_\ell). \]
In both cases, $N$ is a sum of
\[ 2\ell + m + 1 \le 3h \]
primes. This completes the proof.


Theorem 7.10 Let $Q$ be a set of primes that contains a positive proportion of the
primes, that is,
\[ Q(x) > \theta \pi(x) \]
for some $\theta > 0$ and all sufficiently large $x$. Then every sufficiently large integer is
the sum of a bounded number of primes belonging to $Q$.

Proof. We shall first show that the set
\[ A(Q) = \{0, 1\} \cup \{p + q : p, q \in Q\} \]
has positive Shnirel’man density. Let $r(N)$ denote the number of representations
of $N$ as the sum of two primes, and let $r_Q(N)$ denote the number of representations
of $N$ as the sum of two primes belonging to $Q$. Then
\[ \sum_{N \le x} r_Q(N) \ge (Q(x/2))^2 \ge (\theta \pi(x/2))^2 \gg \frac{x^2}{(\log x)^2}. \]
By Lemma 7.7,
\[ \sum_{N \le x} r_Q(N)^2 \le \sum_{N \le x} r(N)^2 \ll \frac{x^3}{(\log x)^4}. \]
It follows exactly as in the proof of Theorem 7.8 that the set $A(Q)$ has positive
Shnirel’man density. Therefore, $A(Q)$ is a basis of finite order. It follows that there
exists a number $h_1$ such that every nonnegative integer is the sum of $h_1$ elements
of $Q \cup \{0, 1\}$.

Choose two primes $p_1, p_2 \in Q$. By Exercise 3, there exists an integer $n_0 =
n_0(p_1, p_2)$ such that every integer $n \ge n_0$ can be written in the form
\[ n = \ell_1(n) p_1 + \ell_2(n) p_2, \]
where $\ell_1(n)$ and $\ell_2(n)$ are nonnegative integers. Let
\[ h_2 = \max\{\ell_1(n) + \ell_2(n) : n = n_0, \ldots, n_0 + h_1\}, \]
and let
\[ h = h_1 + h_2. \]
If $N \ge n_0$, then $N - n_0$ can be written as the sum of at most $h_1$ elements of $Q \cup \{1\}$,
that is,
\[ N - n_0 = \underbrace{1 + \cdots + 1}_{k} + p_{i_1} + \cdots + p_{i_\ell}, \]
where
\[ k + \ell \le h_1. \]
Let $n = n_0 + k$. Since $0 \le k \le h_1$, we have
\[ n_0 + k = \ell_1(n) p_1 + \ell_2(n) p_2, \]
where $\ell_1(n) + \ell_2(n) \le h_2$, and so
\begin{align*}
N &= n_0 + k + p_{i_1} + \cdots + p_{i_\ell} \\
&= \ell_1(n) p_1 + \ell_2(n) p_2 + p_{i_1} + \cdots + p_{i_\ell}
\end{align*}
is a sum of
\[ \ell + \ell_1(n) + \ell_2(n) \le h_1 + h_2 = h \]
primes belonging to the set $Q$. This completes the proof.
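
The step that replaces the block $n_0 + k$ by a combination of $p_1$ and $p_2$ relies on Exercise 3. A small sketch of that representation follows; the function names are illustrative. The explicit threshold $(p_1 - 1)(p_2 - 1)$ used in `threshold` is the classical value for two coprime generators; it is an assumption brought in from outside the text, which only needs the existence of some $n_0$.

```python
def represent(n, p1, p2):
    """Return nonnegative (l1, l2) with n = l1*p1 + l2*p2, or None if impossible."""
    for l1 in range(n // p1 + 1):
        rest = n - l1 * p1
        if rest % p2 == 0:
            return l1, rest // p2
    return None


def threshold(p1, p2):
    """A valid n0 for distinct primes p1, p2.

    Assumption (classical fact, not proved in the text): the largest integer
    with no representation is p1*p2 - p1 - p2, so n0 = (p1 - 1)*(p2 - 1) works.
    """
    return (p1 - 1) * (p2 - 1)


p1, p2 = 3, 5
n0 = threshold(p1, p2)
print(n0, [represent(n, p1, p2) for n in range(n0, n0 + 5)])
```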


7.6 Romanov’s theorem 


Let $a$ be an integer, $a \ge 2$. We investigate how many numbers $N$ up to $x$ can be
written in the form
\[ N = p + a^k, \tag{7.12} \]
where $p$ is a prime and $k$ is a positive integer. Let $r(N)$ be the number of
representations of $N$ in this form. Since the number of positive powers of $a$ up to $x$ is
$\ll \log x$ and the number of primes up to $x$ is $\pi(x) \ll x/\log x$, it follows that
\[ \sum_{N \le x} r(N) = |\{(p, k) : p + a^k \le x\}| \ll \log x \cdot \frac{x}{\log x} = x. \]
Let
\[ A = \{p + a^k : p \text{ prime and } k \ge 1\}, \]
and let $A(x)$ be the counting function of the set $A$. In this section, we shall prove
a remarkable theorem of Romanov that the lower asymptotic density of the set $A$
is positive, that is, there exists a constant $c > 0$ such that
\[ A(x) \ge cx \]
for all sufficiently large $x$. This means that a positive proportion of the natural
numbers can be represented in the form (7.12).


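Lemma 7.8 Let $a$ be an integer, $a \ge 2$. For every integer $d \ge 1$ such that
$(a, d) = 1$, let $e(d)$ denote the exponent of $a$ modulo $d$, that is, the smallest
positive integer such that
\[ a^{e(d)} \equiv 1 \pmod{d}. \]
Then the series
\[ \sum_{\substack{d = 1 \\ (a,d) = 1 \\ \mu^2(d) = 1}}^{\infty} \frac{1}{d\, e(d)} \]
converges.

Proof. If $(a, d) = 1$ and $e(d) = k$, then
\[ a^k \equiv 1 \pmod{d}, \]
and so $d$ divides $a^k - 1$. Since $a^k - 1$ has only finitely many divisors, it follows
that there are only finitely many numbers $d$ such that $e(d) = k$. For $x \ge 2$, let
\[ D = D(x) = \prod_{k \le x} (a^k - 1), \]
and let $n = \omega(D)$ be the number of distinct prime divisors of $D$. Let
\[ E(x) = \sum_{k \le x} \sum_{\substack{e(d) = k \\ (a,d) = 1 \\ \mu^2(d) = 1}} \frac{1}{d}. \]
The number $d$ appears in this double sum at most once, and if $d$ appears, then $d$
divides $a^k - 1$ for some $k \le x$, so $d$ divides $D$. It follows that
\[ E(x) \le \sum_{\substack{d \mid D \\ \mu^2(d) = 1}} \frac{1}{d} = \prod_{p \mid D} \left(1 + \frac{1}{p}\right) \le \prod_{i=1}^{n} \left(1 + \frac{1}{p_i}\right), \]
where $p_1, p_2, \ldots, p_n$ are the first $n$ prime numbers. Since
\[ 2^n = 2^{\omega(D)} \le D = \prod_{k \le x} (a^k - 1) < \prod_{k \le x} a^k \le a^{x(x+1)/2} \le a^{x^2}, \]
it follows that
\[ n \le \left( \frac{\log a}{\log 2} \right) x^2 \ll x^2. \]
By Chebyshev (Theorem 6.4),
\[ \log p_n \ll \log n \ll \log x, \]
and so, by Mertens’s formula (Theorem 6.8),
\[ E(x) \le \prod_{p \le p_n} \left(1 + \frac{1}{p}\right) \ll \log p_n \ll \log x. \]
By partial summation,
\begin{align*}
\sum_{\substack{e(d) \le x \\ (a,d) = 1 \\ \mu^2(d) = 1}} \frac{1}{d\, e(d)} &= \sum_{k \le x} \frac{1}{k} \sum_{\substack{e(d) = k \\ (a,d) = 1 \\ \mu^2(d) = 1}} \frac{1}{d} = \frac{E(x)}{x} + \int_1^x \frac{E(t)}{t^2}\, dt \\
&\ll \frac{\log x}{x} + \int_1^x \frac{\log t}{t^2}\, dt \\
&\ll 1,
\end{align*}
and so the series
\[ \sum_{\substack{d = 1 \\ (a,d) = 1 \\ \mu^2(d) = 1}}^{\infty} \frac{1}{d\, e(d)} = \sum_{k=1}^{\infty} \frac{1}{k} \sum_{\substack{e(d) = k \\ (a,d) = 1 \\ \mu^2(d) = 1}} \frac{1}{d} \]
converges. This completes the proof.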
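
The exponent $e(d)$ and the partial sums of the series in Lemma 7.8 can be explored numerically. The following is a minimal sketch with illustrative helper names; it computes $e(d)$ by repeated multiplication and sums $1/(d\,e(d))$ over square-free $d \le D$ coprime to $a$. A finite computation of course proves nothing about convergence; it only shows how slowly the partial sums grow.

```python
from math import gcd


def order(a, d):
    """e(d): the least k >= 1 with a^k = 1 (mod d); requires gcd(a, d) = 1."""
    if gcd(a, d) != 1:
        raise ValueError("a and d must be coprime")
    k, power = 1, a % d
    while power != 1 % d:          # 1 % d handles the trivial modulus d = 1
        power = power * a % d
        k += 1
    return k


def is_squarefree(d):
    """True if no square greater than 1 divides d."""
    p = 2
    while p * p <= d:
        if d % (p * p) == 0:
            return False
        p += 1
    return True


def partial_sum(a, D):
    """Sum of 1/(d*e(d)) over square-free d <= D with gcd(a, d) = 1."""
    return sum(1.0 / (d * order(a, d))
               for d in range(1, D + 1)
               if gcd(a, d) == 1 and is_squarefree(d))


for D in (10, 100, 1000):
    print(D, partial_sum(2, D))
```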
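Lemma 7.9 Let $a$ be an integer, $a \ge 2$, and let $r(N)$ denote the number of
solutions of the equation
\[ N = p + a^k, \]
where $p$ is a prime and $k$ is a positive integer. Then
\[ \sum_{N \le x} r(N)^2 \ll x. \]

Proof. Since $r(N)^2$ is equal to the number of quadruples $(p_1, p_2, k_1, k_2)$ such
that
\[ p_1 + a^{k_1} = p_2 + a^{k_2} = N, \]
it follows that $\sum_{N \le x} r(N)^2$ is equal to the number of quadruples $(p_1, p_2, k_1, k_2)$
such that
\[ p_1 + a^{k_1} = p_2 + a^{k_2} \le x. \]
This does not exceed the number of solutions of the equation
\[ p_2 - p_1 = a^{k_1} - a^{k_2} \]
with $p_1, p_2 \le x$ and $k_1, k_2 \le \log x/\log a$.
Choose positive integers $k_1 \ne k_2$, and let
\[ h = a^{k_1} - a^{k_2}. \]
Then $h$ is a nonzero, even integer. The number of solutions of the equation
\[ p_2 - p_1 = a^{k_1} - a^{k_2} = h \]
with $p_1, p_2 \le x$ is at most the number of primes $p \le x$ such that $p + |h|$ is also
prime. By Theorem 7.3, this is
\[ \pi_{|h|}(x) \ll \frac{x}{(\log x)^2} \prod_{p \mid h} \left(1 + \frac{1}{p}\right). \]
If $k_2 > k_1$, then
\[ h = -a^{k_1} (a^{k_2 - k_1} - 1) \]
and
\[ \prod_{p \mid h} \left(1 + \frac{1}{p}\right) \le \prod_{p \mid a} \left(1 + \frac{1}{p}\right) \prod_{p \mid (a^{k_2 - k_1} - 1)} \left(1 + \frac{1}{p}\right) \ll \prod_{p \mid (a^{k_2 - k_1} - 1)} \left(1 + \frac{1}{p}\right), \]
where the implied constant depends on $a$. Similarly, if $k_1 > k_2$, then
\[ h = a^{k_2} (a^{k_1 - k_2} - 1) \]
and
\[ \prod_{p \mid h} \left(1 + \frac{1}{p}\right) \ll \prod_{p \mid (a^{k_1 - k_2} - 1)} \left(1 + \frac{1}{p}\right). \]
Finally, if $k_2 = k_1$, the number of solutions of the equation
\[ p_2 - p_1 = a^{k_1} - a^{k_2} = 0 \]
with $p_1, p_2 \le x$ and $1 \le k_2 \le \log x/\log a$ is at most
\[ \pi(x) \frac{\log x}{\log a} \ll x. \]
It follows that
\begin{align*}
\sum_{N \le x} r(N)^2 &\ll x + 2 \sum_{1 \le k_1 < k_2 \le \frac{\log x}{\log a}} \frac{x}{(\log x)^2} \prod_{p \mid (a^{k_2 - k_1} - 1)} \left(1 + \frac{1}{p}\right) \\
&\ll x + \frac{x}{\log x} \sum_{1 \le k \le \frac{\log x}{\log a}} \prod_{p \mid (a^k - 1)} \left(1 + \frac{1}{p}\right) \\
&= x + \frac{x}{\log x} \sum_{1 \le k \le \frac{\log x}{\log a}} \sum_{\substack{d \mid (a^k - 1) \\ \mu^2(d) = 1}} \frac{1}{d}.
\end{align*}
To estimate the last term, we observe that
\[ d \mid (a^k - 1) \]
if and only if
\[ a^k \equiv 1 \pmod{d} \]
if and only if
\[ e(d) \mid k. \]
Then
\begin{align*}
\sum_{N \le x} r(N)^2 &\ll x + \frac{x}{\log x} \sum_{1 \le k \le \frac{\log x}{\log a}} \sum_{\substack{d \mid (a^k - 1) \\ \mu^2(d) = 1}} \frac{1}{d} \\
&= x + \frac{x}{\log x} \sum_{\substack{d \le x \\ \mu^2(d) = 1 \\ (a,d) = 1}} \frac{1}{d} \sum_{\substack{k \le \frac{\log x}{\log a} \\ e(d) \mid k}} 1 \\
&\le x + \frac{x}{\log x} \sum_{\substack{\mu^2(d) = 1 \\ (a,d) = 1}} \frac{\log x}{d\, e(d) \log a} \\
&\ll x + x \sum_{\substack{\mu^2(d) = 1 \\ (a,d) = 1}} \frac{1}{d\, e(d)} \\
&\ll x,
\end{align*}
since the infinite series converges by Lemma 7.8.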
Lemma 7.10 Let $a$ be an integer, $a \ge 2$, and let $r(N)$ denote the number of
solutions of the equation
\[ N = p + a^k, \]
where $p$ is a prime and $k$ is a positive integer. Then
\[ \sum_{N \le x} r(N) \gg x. \]

Proof. If $p \le x/2$ and $a^k \le x/2$, then $p + a^k \le x$, so
\[ \sum_{N \le x} r(N) \gg \pi(x/2) \log(x/2) \gg x. \]
This completes the proof.


Theorem 7.11 (Romanov) Let $a$ be an integer, $a \ge 2$. Let
\[ A = \{p + a^k : p \text{ prime and } k \ge 1\}, \]
and let $A(x)$ be the counting function of the set $A$. There exists a constant $c > 0$
such that
\[ A(x) \ge cx \]
for all sufficiently large $x$.

Proof. We use the Cauchy-Schwarz inequality. By Lemma 7.10 and Lemma 7.9,
there exist positive numbers $c_1$ and $c_2$ such that, for $x$ sufficiently large,
\[ (c_1 x)^2 \le \left( \sum_{N \le x} r(N) \right)^2 \le A(x) \sum_{N \le x} r(N)^2 \le c_2 x A(x) \]
and so
\[ A(x) \ge \frac{c_1^2}{c_2}\, x = cx. \]
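
Romanov's theorem can be illustrated by direct counting for $a = 2$. The sketch below is an informal check under stated assumptions (the helper names are hypothetical and the cutoff 1000 is arbitrary): it tabulates $r(N)$ for $N \le x$, counts how many $N$ have $r(N) \ge 1$, and verifies the Cauchy-Schwarz inequality used in the proof.

```python
def primes_upto(n):
    """Sieve of Eratosthenes: list of primes <= n (n >= 2 assumed)."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, n + 1, p):
                sieve[m] = False
    return [i for i in range(n + 1) if sieve[i]]


def romanov_counts(x, a=2):
    """r[N] = number of pairs (p, k), p prime, k >= 1, with p + a**k = N <= x."""
    primes = primes_upto(x)
    r = [0] * (x + 1)
    power = a
    while power <= x:
        for p in primes:
            if p + power > x:
                break
            r[p + power] += 1
        power *= a
    return r


x = 1000
r = romanov_counts(x)
A_x = sum(1 for v in r if v > 0)          # counting function A(x)
print(A_x / x, sum(r) ** 2 <= A_x * sum(v * v for v in r))
```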


7.7 Covering congruences 


Choosing $a = 2$ in Romanov’s theorem, we see that a positive proportion of the
natural numbers can be written in the form $p + 2^k$. The only even numbers of this
form are $2 + 2^k$, and they constitute a very sparse subset of the even integers, a
subset of density zero, so almost all of the integers of the form $p + 2^k$ are odd.
We shall prove that there exists an infinite arithmetic progression of odd natural
numbers, none of which can be written in the form $p + 2^k$. To do this, we introduce
the concept of covering congruences for the integers.

Let
\[ 1 < m_1 < m_2 < \cdots < m_\ell \]
be a strictly increasing finite sequence of integers, and let $a_1, \ldots, a_\ell$ be any
integers. Then the $\ell$ congruence classes $a_i \pmod{m_i}$ form a system of covering
congruences if, for every integer $k$, there exists at least one $i$ such that
\[ k \equiv a_i \pmod{m_i}. \tag{7.13} \]
This means that the congruence classes $a_i \pmod{m_i}$ cover the integers in the
sense that
\[ \mathbf{Z} = \bigcup_{i=1}^{\ell} \{k \in \mathbf{Z} : k \equiv a_i \pmod{m_i}\}. \]
It is an essential part of the definition of covering congruences that the moduli 


m; are pairwise distinct integers greater than one. Here is a simple example of a 
system of covering congruences. 


Lemma 7.11 The six congruences 


0 (mod 2) 
0 (mod 3) 
1 (mod 4) 
3 (mod 8) 
7 (mod 12) 
23 (mod 24) 


form a set of covering congruences. 


Proof. First, we show that each of the 24 integers 0, 1, ..., 23 satisfies at least
one of these six congruences. Every even integer $k$ satisfies $k \equiv 0 \pmod{2}$. For
odd integers, we have
\begin{align*}
1 &\equiv 1 \pmod{4} \\
3 &\equiv 0 \pmod{3} \\
5 &\equiv 1 \pmod{4} \\
7 &\equiv 7 \pmod{12} \\
9 &\equiv 0 \pmod{3} \\
11 &\equiv 3 \pmod{8} \\
13 &\equiv 1 \pmod{4} \\
15 &\equiv 0 \pmod{3} \\
17 &\equiv 1 \pmod{4} \\
19 &\equiv 7 \pmod{12} \\
21 &\equiv 0 \pmod{3} \\
23 &\equiv 23 \pmod{24}.
\end{align*}
For every integer $k$, there is a unique integer $r \in \{0, 1, \ldots, 23\}$ such that
\[ k \equiv r \pmod{24}. \]
Choose $i$ so that
\[ r \equiv a_i \pmod{m_i}, \]
where $a_i \pmod{m_i}$ is one of our six congruences. Each of the six moduli 2, 3,
4, 8, 12, and 24 divides 24, so $m_i$ divides 24 and
\[ k \equiv r \pmod{m_i}. \]
Therefore,
\[ k \equiv a_i \pmod{m_i}. \]
This completes the proof.
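
The finite check in this proof is mechanical and can be automated. The sketch below is illustrative (the names `SYSTEM` and `uncovered` are not from the text): since every modulus divides the least common multiple of the moduli, it suffices to test one full period of residues.

```python
from math import lcm

# The six congruences a_i (mod m_i) of Lemma 7.11, stored as (a_i, m_i).
SYSTEM = [(0, 2), (0, 3), (1, 4), (3, 8), (7, 12), (23, 24)]


def uncovered(system):
    """Residues modulo lcm(m_1, ..., m_l) that satisfy none of the congruences."""
    period = lcm(*(m for _, m in system))
    return [k for k in range(period)
            if not any(k % m == a % m for a, m in system)]


print(uncovered(SYSTEM))        # an empty list means the system is covering
print(uncovered(SYSTEM[:-1]))   # dropping 23 (mod 24) leaves a gap
```

`math.lcm` with several arguments requires Python 3.9 or later.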


Theorem 7.12 (Erdős) There exists an infinite arithmetic progression of odd
positive integers, none of which is of the form $p + 2^k$.

Proof. We shall use the system of covering congruences $a_i \pmod{m_i}$
constructed in Lemma 7.11. For each of the six moduli $m_i$ in this system, we choose
distinct primes $p_i$ such that
\[ 2^{m_i} \equiv 1 \pmod{p_i}, \]
as follows:
\begin{align*}
2^{2} &\equiv 1 \pmod{3} \\
2^{3} &\equiv 1 \pmod{7} \\
2^{4} &\equiv 1 \pmod{5} \\
2^{8} &\equiv 1 \pmod{17} \\
2^{12} &\equiv 1 \pmod{13} \\
2^{24} &\equiv 1 \pmod{241}.
\end{align*}
Let
\[ \ell = \max\{p_i\} = 241 \]
and
\[ m = 2^{\ell} \cdot 3 \cdot 7 \cdot 5 \cdot 17 \cdot 13 \cdot 241. \]
By the Chinese remainder theorem, there exists a unique congruence class $r
\pmod{m}$ such that $r \equiv 1 \pmod{2^{\ell}}$ and $r \equiv 2^{a_i} \pmod{p_i}$ for $i = 1, \ldots, 6$.
This means that
\begin{align*}
r &\equiv 1 \pmod{2^{\ell}} \\
r &\equiv 2^{0} \pmod{3} \\
r &\equiv 2^{0} \pmod{7} \\
r &\equiv 2^{1} \pmod{5} \\
r &\equiv 2^{3} \pmod{17} \\
r &\equiv 2^{7} \pmod{13} \\
r &\equiv 2^{23} \pmod{241},
\end{align*}
where the exponents in the powers of 2 are the least nonnegative residues $a_i$ in
the six congruence classes in the system of covering congruences. Since $r$ is odd
and the modulus $m$ is even, it follows that every integer in the congruence class $r
\pmod{m}$ is odd.

Let $N$ be an integer in the congruence class $r \pmod{m}$ such that
\[ N > 2^{\ell} + \ell. \]
Let $k$ be a positive integer such that $2^k < N$. There is a congruence class $a_i
\pmod{m_i}$ in the system of covering congruences such that
\[ k \equiv a_i \pmod{m_i}, \]
so $k = a_i + m_i u_i$ for some integer $u_i$. Since
\[ 2^{m_i} \equiv 1 \pmod{p_i}, \]
we have
\[ 2^k = 2^{a_i} (2^{m_i})^{u_i} \equiv 2^{a_i} \pmod{p_i}. \]
Since
\[ N \equiv r \pmod{p_i} \]
and
\[ r \equiv 2^{a_i} \pmod{p_i}, \]
it follows that
\[ N \equiv r \equiv 2^{a_i} \equiv 2^k \pmod{p_i}, \]
and so
\[ N = 2^k + p_i v \]
for some positive integer $v$. If $k \le \ell$, then
\[ p_i v = N - 2^k \ge N - 2^{\ell} > \ell = \max\{p_i\} \ge p_i \]
for $i = 1, \ldots, 6$, and so $v > 1$. If $k > \ell$, then
\[ N - 2^k \equiv N \equiv 1 \pmod{2^{\ell}} \]
and so
\[ p_i v = N - 2^k = 1 + 2^{\ell} w > 2^{\ell} > \ell \ge p_i \]
and $v > 1$. In both cases, $N - 2^k$ is composite. This completes the proof.
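
The construction in this proof is explicit enough to compute. The following sketch is an illustration only; the helper names `crt` and `witness` are hypothetical. It builds the residue $r$ modulo $m$ by the Chinese remainder theorem and, for a few members $N$ of the progression, finds for every $k$ with $2^k < N$ one of the six primes dividing $N - 2^k$. Python's arbitrary-precision integers handle the 241-bit power of 2 without difficulty.

```python
COVERING = [(0, 2), (0, 3), (1, 4), (3, 8), (7, 12), (23, 24)]  # (a_i, m_i)
PRIMES = [3, 7, 5, 17, 13, 241]            # p_i with 2**m_i = 1 (mod p_i)
L = max(PRIMES)


def crt(congruences):
    """Solve x = r_i (mod n_i) for pairwise coprime n_i; return (x, prod n_i)."""
    x, n = 0, 1
    for r_i, n_i in congruences:
        # Choose t with x + n*t = r_i (mod n_i); pow(n, -1, n_i) needs Python 3.8+.
        t = (r_i - x) * pow(n, -1, n_i) % n_i
        x, n = x + n * t, n * n_i
    return x, n


r, m = crt([(1, 2 ** L)] + [(2 ** a, p) for (a, _), p in zip(COVERING, PRIMES)])


def witness(N, k):
    """Return a prime p_i from PRIMES that divides N - 2**k, or None."""
    for (a, mod), p in zip(COVERING, PRIMES):
        if k % mod == a and (N - 2 ** k) % p == 0:
            return p
    return None


N = r + m                                   # one member of the progression
print(N % 2, [witness(N, k) for k in range(1, 13)])
```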
7.8 Notes


Shnirel’man’s fundamental paper was published first in Russian [113] and then
expanded and published in German [114]. By Shnirel’man’s constant we mean the
smallest number $h$ such that every integer greater than one is the sum of at most
$h$ primes. Using the Brun sieve, Shnirel’man proved that this constant is finite.
The best estimate for Shnirel’man’s constant is due to Ramaré [100], who has
proved that every even integer is the sum of at most six primes. It follows that
Shnirel’man’s constant is at most seven. The Goldbach conjecture implies that
Shnirel’man’s constant is three.

In this chapter, I use the Selberg sieve instead of the Brun sieve to prove the
Goldbach-Shnirel’man theorem. See Hua [63] for a nice account of this approach.
Landau [76, 77] gives Shnirel’man’s original method. Theorem 7.10, the
generalization of the Goldbach-Shnirel’man theorem to dense subsets of the primes, is
due to Nathanson [90].

Selberg introduced his sieve in a beautiful short paper [109]. I use Selberg’s
original proof of the sieve inequality (7.2). See Selberg’s Collected Papers [110,
111] for his papers on sieve theory. Prachar [97] contains a nice exposition of the
Selberg sieve, with many applications. The standard references on sieve methods
are the monographs of Halberstam and Richert [44] and Motohashi [87].

Romanov’s theorem appears in the paper [103]. Romanov also proved that, for
a fixed exponent $k$, the set of integers of the form $p + n^k$ has positive density.
The proof of Theorem 7.8 of Romanov’s theorem was simplified by Erdős and
Turán [30] and Erdős [33].

Erdős [32] invented covering congruences and used them to construct the infinite
arithmetic progression of odd positive integers not of the form $p + 2^k$, as described
in Theorem 7.12. Crocker [16] proved that there exists an infinite set of odd positive
integers that cannot be represented as the sum of a prime and two positive powers
of 2. Crocker’s set is sparse. It is an open problem to determine if there exists an
infinite arithmetic progression of odd positive integers not of the form $p + 2^{k_1} + 2^{k_2}$.

There are many unsolved problems concerning covering congruences. It is not 
known, for example, whether there exists a system of covering congruences all of 
whose moduli are odd. Nor is it known whether, for any number M, there exists 
a system of covering congruences all of whose moduli are greater than M. The 
best result is due to Choi [12], who proved that there exists a system of covering 
congruences with smallest modulus 20. 


7.9 Exercises 


1. Prove that for any square-free integer $d$ there are exactly $3^{\omega(d)}$ pairs of
positive integers $d_1, d_2$ such that $[d_1, d_2] = d$.

2. Let $\omega(n)$ denote the number of distinct prime divisors of $n$. Let $n \ge 2$ and
$r \ge 0$. Prove that
\[ \sum_{\substack{d \mid n \\ \omega(d) \le 2r + 1}} \mu(d) \le 0 \le \sum_{\substack{d \mid n \\ \omega(d) \le 2r}} \mu(d). \]

3. Let $a_1$ and $a_2$ be relatively prime positive integers. Prove that there exists an
integer $n_0 = n_0(a_1, a_2)$ such that every integer $n \ge n_0$ can be written in the
form
\[ n = \ell_1(n) a_1 + \ell_2(n) a_2 \]
for some nonnegative integers $\ell_1(n), \ell_2(n)$.

4. Construct a system of covering congruences whose moduli are 2, 3, 4, 6,
and 12.

5. Let us call an integer $n$ exceptional if $n - 2^k$ is prime for all positive integers
$k < \log n/\log 2$. Find all exceptional numbers up to 105. Erdős [32] has
written that “it seems likely that 105 is the largest exceptional integer.”

6. Let $\{a_i \pmod{m_i} : i = 1, \ldots, k\}$ be a system of covering congruences.
Prove that


8 


Sums of three primes 


The method which I discovered in 1937 for estimating sums over
primes permits, in the first instance, the evaluation of an estimate for
the simplest of such sums, i.e. a sum of the type:
\[ \sum_{p \le N} e^{2\pi i \alpha p}. \]
This estimate in combination with the previously known theorems
concerning the distribution of primes in arithmetic progressions ...
paved the way for establishing unconditionally the asymptotic
formula of Hardy and Littlewood in the Goldbach ternary representation
problem.


I. M. Vinogradov [135, page 365] 


8.1 Vinogradov’s theorem 


Vinogradov proved that every sufficiently large odd integer is the sum of three
primes. In addition, he obtained an asymptotic formula for the number of
representations of an odd integer as the sum of three prime numbers. Vinogradov’s
theorem is one of the great results in additive prime number theory. The principal
ingredients of the proof are the circle method and an estimate of a certain
exponential sum over prime numbers.

The counting function for the number of representations of an odd integer $N$ as
the sum of three primes is
\[ r(N) = \sum_{p_1 + p_2 + p_3 = N} 1. \]
The following is Vinogradov’s asymptotic formula for $r(N)$.

Theorem 8.1 (Vinogradov) There exists an arithmetic function $\mathfrak{S}(N)$ and positive
constants $c_1$ and $c_2$ such that
\[ c_1 < \mathfrak{S}(N) < c_2 \]
for all sufficiently large odd integers $N$, and
\[ r(N) = \mathfrak{S}(N) \frac{N^2}{2 (\log N)^3} \left( 1 + O\left( \frac{\log\log N}{\log N} \right) \right). \]

The arithmetic function $\mathfrak{S}(N)$ is called the singular series for the ternary
Goldbach problem.


8.2 The singular series 


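We begin by studying the arithmetic function
\[ \mathfrak{S}(N) = \sum_{q=1}^{\infty} \frac{\mu(q) c_q(N)}{\varphi(q)^3}, \tag{8.1} \]
where
\[ c_q(N) = \sum_{\substack{a = 1 \\ (a,q) = 1}}^{q} e(aN/q) \]
is Ramanujan’s sum (A.2). The function $\mathfrak{S}(N)$ is called the singular series for the
ternary Goldbach problem.

Theorem 8.2 The singular series $\mathfrak{S}(N)$ converges absolutely and uniformly in
$N$ and has the Euler product
\[ \mathfrak{S}(N) = \prod_{p} \left( 1 + \frac{1}{(p-1)^3} \right) \prod_{p \mid N} \left( 1 - \frac{1}{p^2 - 3p + 3} \right). \]
There exist positive constants $c_1$ and $c_2$ such that
\[ c_1 < \mathfrak{S}(N) < c_2 \]
for all odd positive integers $N$. Moreover, for any $\varepsilon > 0$,
\[ \mathfrak{S}(N, Q) = \sum_{q \le Q} \frac{\mu(q) c_q(N)}{\varphi(q)^3} = \mathfrak{S}(N) + O\left( Q^{-(1-\varepsilon)} \right), \tag{8.2} \]
where the implied constant depends only on $\varepsilon$.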
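Proof. Clearly, $c_q(N) \ll \varphi(q)$. By Theorem A.16,
\[ \varphi(q) > q^{1-\varepsilon} \]
for $\varepsilon > 0$ and all sufficiently large integers $q$, and so
\[ \frac{\mu(q) c_q(N)}{\varphi(q)^3} \ll \frac{1}{\varphi(q)^2} \ll \frac{1}{q^{2-\varepsilon}}. \]
Thus, the singular series converges absolutely and uniformly in $N$. Moreover,
\[ \mathfrak{S}(N) - \mathfrak{S}(N, Q) \ll \sum_{q > Q} \frac{1}{\varphi(q)^2} \ll \sum_{q > Q} \frac{1}{q^{2-\varepsilon}} \ll \frac{1}{Q^{1-\varepsilon}}. \]
By Theorem A.24, $c_q(N)$ is a multiplicative function of $q$ and
\[ c_p(N) = \begin{cases} p - 1 & \text{if $p$ divides $N$} \\ -1 & \text{if $p$ does not divide $N$.} \end{cases} \]
Since the arithmetic function
\[ \frac{\mu(q) c_q(N)}{\varphi(q)^3} \]
is multiplicative in $q$ and $\mu(p^j) = 0$ for $j \ge 2$, it follows from Theorem A.28 that
the singular series has the Euler product
\begin{align*}
\mathfrak{S}(N) &= \prod_{p} \left( 1 + \sum_{j=1}^{\infty} \frac{\mu(p^j) c_{p^j}(N)}{\varphi(p^j)^3} \right) \\
&= \prod_{p} \left( 1 - \frac{c_p(N)}{\varphi(p)^3} \right) \\
&= \prod_{p \nmid N} \left( 1 + \frac{1}{(p-1)^3} \right) \prod_{p \mid N} \left( 1 - \frac{1}{(p-1)^2} \right) \\
&= \prod_{p} \left( 1 + \frac{1}{(p-1)^3} \right) \prod_{p \mid N} \left( 1 - \frac{1}{p^2 - 3p + 3} \right),
\end{align*}
and so there exist positive constants $c_1$ and $c_2$ such that
\[ c_1 < \mathfrak{S}(N) < c_2 \]
for all odd positive integers $N$ (for even $N$ the factor with $p = 2$ vanishes).
This completes the proof.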
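
The two expressions for the singular series, the sum (8.1) and the Euler product, can be compared numerically. The sketch below is illustrative and rests on one assumption not stated in this section: Ramanujan's sum is evaluated through the classical identity $c_q(N) = \sum_{d \mid (q, N)} d\,\mu(q/d)$, which agrees with the values of $c_p(N)$ quoted from Theorem A.24. The truncation points are arbitrary.

```python
from math import gcd


def mobius_and_phi(Q):
    """Tables mu[q], phi[q] for 0 <= q <= Q and the list of primes <= Q."""
    mu = [1] * (Q + 1)
    phi = list(range(Q + 1))
    is_prime = [True] * (Q + 1)
    primes = []
    for p in range(2, Q + 1):
        if is_prime[p]:
            primes.append(p)
            for m in range(p, Q + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]              # one sign change per prime factor
                phi[m] -= phi[m] // p       # multiply by (1 - 1/p)
            for m in range(p * p, Q + 1, p * p):
                mu[m] = 0                   # not square-free
    return mu, phi, primes


def ramanujan_sum(q, N, mu):
    """c_q(N) via the assumed identity: sum of d*mu(q/d) over d | gcd(q, N)."""
    g = gcd(q, N)
    return sum(d * mu[q // d] for d in range(1, g + 1) if g % d == 0)


def singular_series_sum(N, Q):
    """Truncated series S(N, Q) from (8.2)."""
    mu, phi, _ = mobius_and_phi(Q)
    return sum(mu[q] * ramanujan_sum(q, N, mu) / phi[q] ** 3
               for q in range(1, Q + 1) if mu[q])


def singular_series_product(N, P):
    """Euler product of Theorem 8.2 truncated to primes p <= P."""
    _, _, primes = mobius_and_phi(P)
    value = 1.0
    for p in primes:
        value *= 1 + 1 / (p - 1) ** 3
        if N % p == 0:
            value *= 1 - 1 / (p * p - 3 * p + 3)
    return value


print(singular_series_sum(15, 2000), singular_series_product(15, 2000))
```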


8.3. Decomposition into major and minor arcs 


As in the proof of the Hardy-Littlewood asymptotic formula for Waring’s problem,
we decompose the unit interval [0, 1] into two disjoint sets: the major arcs $\mathfrak{M}$ and
the minor arcs $\mathfrak{m}$.

Let $B > 0$ and
\[ Q = (\log N)^B. \tag{8.3} \]
For
\[ 1 \le q \le Q, \]
\[ 0 \le a \le q, \]
and
\[ (a, q) = 1, \]
the major arc $\mathfrak{M}(q, a)$ is the interval consisting of all real numbers $\alpha \in [0, 1]$ such
that
\[ \left| \alpha - \frac{a}{q} \right| \le \frac{Q}{N}. \]
If $\alpha \in \mathfrak{M}(q, a) \cap \mathfrak{M}(q', a')$ and $a/q \ne a'/q'$, then $|aq' - a'q| \ge 1$ and
\[ \frac{1}{Q^2} \le \frac{1}{qq'} \le \left| \frac{a}{q} - \frac{a'}{q'} \right| \le \left| \alpha - \frac{a}{q} \right| + \left| \alpha - \frac{a'}{q'} \right| \le \frac{2Q}{N}, \]
or, equivalently,
\[ N \le 2Q^3 = 2 (\log N)^{3B}. \]
This is impossible for $N$ sufficiently large. Therefore, the major arcs $\mathfrak{M}(q, a)$ are
pairwise disjoint for large $N$. The set of major arcs is
\[ \mathfrak{M} = \bigcup_{q=1}^{Q} \bigcup_{\substack{a = 0 \\ (a,q) = 1}}^{q} \mathfrak{M}(q, a) \subseteq [0, 1] \]
and the set of minor arcs is
\[ \mathfrak{m} = [0, 1] \setminus \mathfrak{M}. \]

We consider a weighted sum over the representations of $N$ as a sum of three
primes:
\[ R(N) = \sum_{p_1 + p_2 + p_3 = N} \log p_1 \log p_2 \log p_3. \]
Vinogradov obtained an asymptotic formula for $R(N)$, from which Theorem 8.1
will follow by an elementary argument. We can use the circle method to express
the representation function $R(N)$ as the integral of a trigonometric polynomial
over the major and minor arcs. Let
\[ F(\alpha) = \sum_{p \le N} (\log p)\, e(p\alpha). \tag{8.4} \]
This exponential sum over primes is the generating function for $R(N)$, and
\begin{align*}
R(N) &= \sum_{p_1 + p_2 + p_3 = N} \log p_1 \log p_2 \log p_3 = \int_0^1 F(\alpha)^3 e(-N\alpha)\, d\alpha \\
&= \int_{\mathfrak{M}} F(\alpha)^3 e(-N\alpha)\, d\alpha + \int_{\mathfrak{m}} F(\alpha)^3 e(-N\alpha)\, d\alpha.
\end{align*}
The main term in Vinogradov’s theorem will come from the integral over the major
arcs, and the integral over the minor arcs will be negligible.
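
The identity expressing $R(N)$ as an integral of $F(\alpha)^3 e(-N\alpha)$ over the unit interval can be verified numerically for small $N$. The sketch below is an illustration with hypothetical function names. Because $F(\alpha)^3 e(-N\alpha)$ is a trigonometric polynomial whose frequencies lie between $-N$ and $2N$, an equally spaced average over $M = 3N + 1$ sample points reproduces the integral exactly, up to floating-point rounding.

```python
import cmath
from math import log


def primes_upto(n):
    """Sieve of Eratosthenes: list of primes <= n (n >= 2 assumed)."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, n + 1, p):
                sieve[m] = False
    return [i for i in range(n + 1) if sieve[i]]


def e(t):
    """e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * cmath.pi * t)


def R_direct(N):
    """Weighted count of ordered prime triples with p1 + p2 + p3 = N."""
    ps = primes_upto(N)
    pset = set(ps)
    total = 0.0
    for p1 in ps:
        for p2 in ps:
            p3 = N - p1 - p2
            if p3 in pset:
                total += log(p1) * log(p2) * log(p3)
    return total


def R_integral(N):
    """Integral of F(alpha)^3 e(-N alpha) over [0, 1] via an M-point average."""
    ps = primes_upto(N)
    M = 3 * N + 1        # exceeds every nonzero frequency, so the average is exact
    total = 0j
    for j in range(M):
        alpha = j / M
        F = sum(log(p) * e(p * alpha) for p in ps)
        total += F ** 3 * e(-N * alpha)
    return (total / M).real


print(R_direct(21), R_integral(21))
```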


8.4 The integral over the major arcs 


Just as in the Hardy-Littlewood asymptotic formula, the integral over the major
arcs in Vinogradov’s theorem is (except for a small error term) the product of the
singular series $\mathfrak{S}(N)$ and an integral $J(N)$. In this case, the integral $J(N)$ is very
easy to evaluate.

Lemma 8.1 Let
\[ u(\beta) = \sum_{m=1}^{N} e(m\beta). \]
Then
\[ J(N) = \int_{-1/2}^{1/2} u(\beta)^3 e(-N\beta)\, d\beta = \frac{N^2}{2} + O(N). \]

Proof. By Theorem 5.1, the number of representations of $N$ as the sum of three
positive integers is
\begin{align*}
J(N) &= \int_{-1/2}^{1/2} u(\beta)^3 e(-N\beta)\, d\beta \\
&= \sum_{m_1 = 1}^{N} \sum_{m_2 = 1}^{N} \sum_{m_3 = 1}^{N} \int_{-1/2}^{1/2} e((m_1 + m_2 + m_3 - N)\beta)\, d\beta \\
&= \binom{N-1}{2} = \frac{N^2}{2} + O(N).
\end{align*}
This completes the proof.

In the next lemma we shall apply the Siegel-Walfisz theorem on the
distribution of prime numbers in arithmetic progressions. A proof can be found in
Davenport [19].
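Theorem 8.3 (Siegel-Walfisz) If $q \ge 1$ and $(q, a) = 1$, then, for any $C > 0$,
\[ \vartheta(x; q, a) = \sum_{\substack{p \le x \\ p \equiv a \pmod{q}}} \log p = \frac{x}{\varphi(q)} + O\left( \frac{x}{(\log x)^C} \right) \]
for all $x \ge 2$, where the implied constant depends only on $C$.

Lemma 8.2 Let
\[ F_x(\alpha) = \sum_{p \le x} (\log p)\, e(p\alpha). \]
Let $B$ and $C$ be positive real numbers. If $1 \le q \le Q = (\log N)^B$ and $(q, a) = 1$,
then
\[ F_x\left( \frac{a}{q} \right) = \frac{\mu(q)}{\varphi(q)}\, x + O\left( \frac{QN}{(\log N)^C} \right) \]
for $1 \le x \le N$, where the implied constant depends only on $B$ and $C$.

Proof. Let $p \equiv r \pmod{q}$. Then $p$ divides $q$ if and only if $(r, q) > 1$, and so
\[ \sum_{\substack{r = 1 \\ (r,q) > 1}}^{q} \sum_{\substack{p \le x \\ p \equiv r \pmod{q}}} (\log p)\, e\left( \frac{pa}{q} \right) = \sum_{\substack{p \le x \\ p \mid q}} (\log p)\, e\left( \frac{pa}{q} \right) \ll \sum_{p \mid q} \log p \le \log q. \]
Therefore,
\begin{align*}
F_x\left( \frac{a}{q} \right) &= \sum_{r=1}^{q} \sum_{\substack{p \le x \\ p \equiv r \pmod{q}}} (\log p)\, e\left( \frac{pa}{q} \right) \\
&= \sum_{\substack{r = 1 \\ (r,q) = 1}}^{q} \sum_{\substack{p \le x \\ p \equiv r \pmod{q}}} (\log p)\, e\left( \frac{pa}{q} \right) + O(\log q) \\
&= \sum_{\substack{r = 1 \\ (r,q) = 1}}^{q} e\left( \frac{ra}{q} \right) \sum_{\substack{p \le x \\ p \equiv r \pmod{q}}} \log p + O(\log q) \\
&= \sum_{\substack{r = 1 \\ (r,q) = 1}}^{q} e\left( \frac{ra}{q} \right) \vartheta(x; q, r) + O(\log Q) \\
&= \sum_{\substack{r = 1 \\ (r,q) = 1}}^{q} e\left( \frac{ra}{q} \right) \left( \frac{x}{\varphi(q)} + O\left( \frac{x}{(\log x)^C} \right) \right) + O(\log Q) \\
&= \frac{c_q(a)}{\varphi(q)}\, x + O\left( \frac{qx}{(\log x)^C} \right) + O(\log Q) \\
&= \frac{\mu(q)}{\varphi(q)}\, x + O\left( \frac{QN}{(\log N)^C} \right),
\end{align*}
since, by Theorem A.24, $c_q(a) = \mu(q)$ if $(q, a) = 1$.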
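Lemma 8.3 Let $B$ and $C$ be positive real numbers with $C > 2B$. If $\alpha \in \mathfrak{M}(q, a)$
and $\beta = \alpha - a/q$, then
\[ F(\alpha) = \frac{\mu(q)}{\varphi(q)}\, u(\beta) + O\left( \frac{Q^2 N}{(\log N)^C} \right) \]
and
\[ F(\alpha)^3 = \frac{\mu(q)}{\varphi(q)^3}\, u(\beta)^3 + O\left( \frac{Q^2 N^3}{(\log N)^C} \right), \]
where the implied constants depend only on $B$ and $C$.

Proof. If $\alpha \in \mathfrak{M}(q, a)$, then $\alpha = a/q + \beta$, where $|\beta| \le Q/N$. Let
\[ \lambda(m) = \begin{cases} \log p & \text{if $m = p$ is prime} \\ 0 & \text{otherwise.} \end{cases} \]
Then
\begin{align*}
F(\alpha) - \frac{\mu(q)}{\varphi(q)}\, u(\beta) &= \sum_{p \le N} (\log p)\, e(p\alpha) - \frac{\mu(q)}{\varphi(q)} \sum_{m=1}^{N} e(m\beta) \\
&= \sum_{m=1}^{N} \lambda(m)\, e(m\alpha) - \frac{\mu(q)}{\varphi(q)} \sum_{m=1}^{N} e(m\beta) \\
&= \sum_{m=1}^{N} \lambda(m)\, e\left( \frac{ma}{q} + m\beta \right) - \frac{\mu(q)}{\varphi(q)} \sum_{m=1}^{N} e(m\beta) \\
&= \sum_{m=1}^{N} \left( \lambda(m)\, e\left( \frac{ma}{q} \right) - \frac{\mu(q)}{\varphi(q)} \right) e(m\beta).
\end{align*}
By Lemma 8.2, for $1 \le x \le N$ we have
\begin{align*}
A(x) &= \sum_{1 \le m \le x} \left( \lambda(m)\, e\left( \frac{ma}{q} \right) - \frac{\mu(q)}{\varphi(q)} \right) \\
&= \sum_{1 \le m \le x} \lambda(m)\, e\left( \frac{ma}{q} \right) - \frac{\mu(q)}{\varphi(q)}\, x + O\left( \frac{1}{\varphi(q)} \right) \\
&= F_x\left( \frac{a}{q} \right) - \frac{\mu(q)}{\varphi(q)}\, x + O(1) \\
&= O\left( \frac{QN}{(\log N)^C} \right).
\end{align*}
By partial summation, we obtain
\begin{align*}
F(\alpha) - \frac{\mu(q)}{\varphi(q)}\, u(\beta) &= A(N)\, e(N\beta) - 2\pi i \beta \int_1^N A(x)\, e(x\beta)\, dx \\
&\ll |A(N)| + |\beta| N \max\{|A(x)| : 1 \le x \le N\} \\
&\ll \frac{Q^2 N}{(\log N)^C}.
\end{align*}
Clearly, $|u(\beta)| \le N$. Since $C > 2B$, we have
\[ \frac{Q^2 N}{(\log N)^C} = \frac{N}{(\log N)^{C - 2B}} < N, \]
and the estimate for $F(\alpha)^3$ follows immediately. This completes the proof.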


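Theorem 8.4 For any positive numbers $B$, $C$, and $\varepsilon$ with $C > 2B$, the integral
over the major arcs is
\[ \int_{\mathfrak{M}} F(\alpha)^3 e(-N\alpha)\, d\alpha = \mathfrak{S}(N) \frac{N^2}{2} + O\left( \frac{N^2}{(\log N)^{(1-\varepsilon)B}} \right) + O\left( \frac{N^2}{(\log N)^{C - 5B}} \right), \]
where the implied constants depend only on $B$, $C$, and $\varepsilon$.

Proof. We note that the length of the major arc $\mathfrak{M}(q, a)$ is $Q/N$ if $q = 1$ and
$2Q/N$ if $q \ge 2$. By Lemma 8.3,
\begin{align*}
&\int_{\mathfrak{M}} \left( F(\alpha)^3 - \frac{\mu(q)}{\varphi(q)^3}\, u\left( \alpha - \frac{a}{q} \right)^3 \right) e(-N\alpha)\, d\alpha \\
&\quad = \sum_{q \le Q} \sum_{\substack{a = 0 \\ (a,q) = 1}}^{q} \int_{\mathfrak{M}(q,a)} \left( F(\alpha)^3 - \frac{\mu(q)}{\varphi(q)^3}\, u\left( \alpha - \frac{a}{q} \right)^3 \right) e(-N\alpha)\, d\alpha \\
&\quad \ll \sum_{q \le Q} \sum_{\substack{a = 0 \\ (a,q) = 1}}^{q} \int_{\mathfrak{M}(q,a)} \frac{Q^2 N^3}{(\log N)^C}\, d\alpha \\
&\quad \ll \frac{Q^5 N^2}{(\log N)^C} \\
&\quad = \frac{N^2}{(\log N)^{C - 5B}}.
\end{align*}
If $\alpha = a/q + \beta \in \mathfrak{M}(q, a)$, then $|\beta| \le Q/N$ and
\begin{align*}
&\sum_{q \le Q} \sum_{\substack{a = 0 \\ (a,q) = 1}}^{q} \frac{\mu(q)}{\varphi(q)^3} \int_{\mathfrak{M}(q,a)} u\left( \alpha - \frac{a}{q} \right)^3 e(-N\alpha)\, d\alpha \\
&\quad = \sum_{q \le Q} \frac{\mu(q)}{\varphi(q)^3} \sum_{\substack{a = 1 \\ (a,q) = 1}}^{q} \int_{a/q - Q/N}^{a/q + Q/N} u\left( \alpha - \frac{a}{q} \right)^3 e(-N\alpha)\, d\alpha \\
&\quad = \sum_{q \le Q} \frac{\mu(q)}{\varphi(q)^3} \sum_{\substack{a = 1 \\ (a,q) = 1}}^{q} e(-Na/q) \int_{-Q/N}^{Q/N} u(\beta)^3 e(-N\beta)\, d\beta \\
&\quad = \sum_{q \le Q} \frac{\mu(q) c_q(-N)}{\varphi(q)^3} \int_{-Q/N}^{Q/N} u(\beta)^3 e(-N\beta)\, d\beta \\
&\quad = \mathfrak{S}(N, Q) \int_{-Q/N}^{Q/N} u(\beta)^3 e(-N\beta)\, d\beta.
\end{align*}
By Lemma 4.7, if $|\beta| \le 1/2$, then
\[ u(\beta) \ll |\beta|^{-1} \]
and
\begin{align*}
\int_{Q/N}^{1/2} u(\beta)^3 e(-N\beta)\, d\beta &\ll \int_{Q/N}^{1/2} |u(\beta)|^3\, d\beta \\
&\ll \int_{Q/N}^{1/2} \beta^{-3}\, d\beta \\
&< \frac{N^2}{Q^2}.
\end{align*}
Similarly,
\[ \int_{-1/2}^{-Q/N} u(\beta)^3 e(-N\beta)\, d\beta \ll \frac{N^2}{Q^2}. \]
By Lemma 8.1,
\begin{align*}
\int_{-Q/N}^{Q/N} u(\beta)^3 e(-N\beta)\, d\beta &= \int_{-1/2}^{1/2} u(\beta)^3 e(-N\beta)\, d\beta + O\left( N^2 Q^{-2} \right) \\
&= \frac{N^2}{2} + O(N) + O\left( \frac{N^2}{Q^2} \right) \\
&= \frac{N^2}{2} + O\left( \frac{N^2}{Q^2} \right).
\end{align*}
By Theorem 8.2,
\[ \mathfrak{S}(N, Q) = \mathfrak{S}(N) + O\left( \frac{1}{Q^{1-\varepsilon}} \right). \]
Therefore,
\begin{align*}
\int_{\mathfrak{M}} F(\alpha)^3 e(-N\alpha)\, d\alpha &= \mathfrak{S}(N, Q) \int_{-Q/N}^{Q/N} u(\beta)^3 e(-N\beta)\, d\beta + O\left( \frac{N^2}{(\log N)^{C - 5B}} \right) \\
&= \mathfrak{S}(N) \frac{N^2}{2} + O\left( \frac{N^2}{Q^{1-\varepsilon}} \right) + O\left( \frac{N^2}{(\log N)^{C - 5B}} \right) \\
&= \mathfrak{S}(N) \frac{N^2}{2} + O\left( \frac{N^2}{(\log N)^{(1-\varepsilon)B}} \right) + O\left( \frac{N^2}{(\log N)^{C - 5B}} \right).
\end{align*}
This completes the proof.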
8.5 An exponential sum over primes


To estimate the integral over the minor arcs, we shall apply Vinogradov’s estimate
for the exponential sum $F(\alpha)$. The proof is based on a combinatorial identity of
Vaughan.

Theorem 8.5 (Vinogradov) If
\[ \left| \alpha - \frac{a}{q} \right| \le \frac{1}{q^2}, \]
where $a$ and $q$ are integers such that $1 \le q \le N$ and $(a, q) = 1$, then
\[ F(\alpha) \ll \left( \frac{N}{q^{1/2}} + N^{4/5} + N^{1/2} q^{1/2} \right) (\log N)^4. \]

The proof is divided into a series of lemmas. The first is an identity involving
arithmetic functions of two variables and truncated sums of the Möbius function.

Lemma 8.4 (Vaughan’s identity) For $u \ge 1$, let
\[ M_u(k) = \sum_{\substack{d \mid k \\ d \le u}} \mu(d). \]
Let $\Phi(k, \ell)$ be an arithmetic function of two variables. Then
\[ \sum_{u < \ell \le N} \Phi(1, \ell) + \sum_{u < k \le N} \sum_{u < \ell \le \frac{N}{k}} M_u(k) \Phi(k, \ell) = \sum_{d \le u} \sum_{u < \ell \le \frac{N}{d}} \sum_{m \le \frac{N}{\ell d}} \mu(d) \Phi(dm, \ell). \]

Proof. We shall evaluate the sum 


N 


in two different ways. Since 


ifn = 1 
dH (¢) =| 0 otherwise, 
it follows that 
1 ifk=1 
Matt) =| 0 ifl<k<u. 


Therefore, 


S= >> O(1,£)+ > Yo Milk) PE, 2). 


u<l<N u<k<N u<l<* 
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On the other hand, interchanging summations and letting k = dm, we obtain 


Lemma 8.5 


where 


and 


N 
=D DOK, O 


k= =l u<es% a 


d<u 


> d HME, 4 


d<u \ yce< 


=D, dL, uido(dm, 6 


d<u ms* u<e<* 


=D LD, uldo(dm, 6). 


dsu y<t<% mst 


Let A(£) be the von Mangoldt function. For every real number a, 


F(a) = S; — Sy — 83+ O(N”), 


=) » » u(d)A(L)e(adém), 


d<N2/5 e<*% m<* 


S= DS) dD Do eMAMe(adem), 


d<N2/5 £<N2/5 m<% 


S3 = > > Myzis(k)A(l)e(ake). 
k>N2/5 N25 <£<N/k 


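Proof. We apply Vaughan's identity with
\[
u = N^{2/5}
\]
and
\[
\Phi(k,\ell) = \Lambda(\ell)e(\alpha k\ell).
\]
The first term in Vaughan's identity is
\[
\begin{aligned}
\sum_{u < \ell \le N} \Phi(1,\ell)
&= \sum_{N^{2/5} < \ell \le N} \Lambda(\ell)e(\alpha\ell) \\
&= \sum_{\ell \le N} \Lambda(\ell)e(\alpha\ell) - \sum_{\ell \le N^{2/5}} \Lambda(\ell)e(\alpha\ell) \\
&= \sum_{p^k \le N} (\log p)e(\alpha p^k) + O\left(N^{2/5}\log N\right) \\
&= \sum_{p \le N} (\log p)e(\alpha p) + \sum_{\substack{p^k \le N \\ k \ge 2}} (\log p)e(\alpha p^k) + O\left(N^{2/5}\log N\right) \\
&= F(\alpha) + O\Bigg( \sum_{\substack{p^k \le N \\ k \ge 2}} \log p \Bigg) + O\left(N^{2/5}\log N\right) \\
&= F(\alpha) + O\Bigg( \sum_{p \le N^{1/2}} \left[\frac{\log N}{\log p}\right] \log p \Bigg) + O\left(N^{2/5}\log N\right) \\
&= F(\alpha) + O\left(\pi(N^{1/2})\log N\right) + O\left(N^{2/5}\log N\right) \\
&= F(\alpha) + O\left(N^{1/2}\right),
\end{aligned}
\]
since
\[
\pi\left(N^{1/2}\right) \ll \frac{N^{1/2}}{\log N}
\]
by Chebyshev (Theorem 6.3).
The second term in Vaughan's identity is simply
\[
\sum_{N^{2/5} < k \le N} \sum_{N^{2/5} < \ell \le N/k} M_{N^{2/5}}(k)\Lambda(\ell)e(\alpha k\ell) = S_3.
\]
The third term in Vaughan's identity is
\[
\begin{aligned}
\sum_{d \le N^{2/5}} \sum_{N^{2/5} < \ell \le N/d} \sum_{m \le N/(\ell d)} \mu(d)\Lambda(\ell)e(\alpha d\ell m)
&= \sum_{d \le N^{2/5}} \sum_{\ell \le N/d} \sum_{m \le N/(\ell d)} \mu(d)\Lambda(\ell)e(\alpha d\ell m) \\
&\quad - \sum_{d \le N^{2/5}} \sum_{\ell \le N^{2/5}} \sum_{m \le N/(\ell d)} \mu(d)\Lambda(\ell)e(\alpha d\ell m) \\
&= S_1 - S_2.
\end{aligned}
\]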


This completes the proof. 
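
Vaughan's identity (Lemma 8.4) is an exact identity between finite sums, so it can be checked by brute force for small parameters. The following is a minimal sketch, assuming an integer truncation level u and an arbitrary integer-valued test function (the names `vaughan_sides` and `phi` are illustrative and not from the text).

```python
def mobius(n):
    """Moebius function mu(n), computed by trial division."""
    if n == 1:
        return 1
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # n has a square factor
            result = -result
        p += 1
    if n > 1:
        result = -result          # one remaining prime factor
    return result


def M(u, k):
    """Truncated sum M_u(k): sum of mu(d) over divisors d of k with d <= u."""
    return sum(mobius(d) for d in range(1, min(u, k) + 1) if k % d == 0)


def vaughan_sides(N, u, phi):
    """Return (left, right) sides of Vaughan's identity for an integer u >= 1."""
    left = sum(phi(1, l) for l in range(u + 1, N + 1))
    left += sum(M(u, k) * phi(k, l)
                for k in range(u + 1, N + 1)
                for l in range(u + 1, N // k + 1))
    right = sum(mobius(d) * phi(d * m, l)
                for d in range(1, u + 1)
                for l in range(u + 1, N // d + 1)
                for m in range(1, N // (l * d) + 1))
    return left, right


def phi(k, l):
    """Hypothetical test function of two variables (any function works)."""
    return (k * k + 3 * l + k * l) % 11 - 5


left, right = vaughan_sides(60, 4, phi)
print(left == right)
```

Integer arithmetic is used so that the comparison is exact; with $\Phi(k,\ell) = \Lambda(\ell)e(\alpha k\ell)$ the same loops apply, but the two sides would agree only up to floating-point rounding.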
In the next three lemmas, we find upper bounds for the sums $S_1$, $S_2$, and $S_3$.


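Lemma 8.6 If
\[
\left| \alpha - \frac{a}{q} \right| \le \frac{1}{q^2},
\]
where $1 \le q \le N$ and $(a,q) = 1$, then
\[
|S_1| \ll \left( \frac{N}{q} + N^{2/5} + q \right) (\log N)^2.
\]

Proof. Let $u = N^{2/5}$. Since $\sum_{\ell \mid r} \Lambda(\ell) = \log r$, we have
\[
\begin{aligned}
S_1 &= \sum_{d \le u} \sum_{\ell \le N/d} \sum_{m \le N/(\ell d)} \mu(d)\Lambda(\ell)e(\alpha d\ell m) \\
&= \sum_{d \le u} \sum_{\ell m \le N/d} \mu(d)\Lambda(\ell)e(\alpha d\ell m) \\
&= \sum_{d \le u} \sum_{r \le N/d} \mu(d)e(\alpha dr) \sum_{\ell \mid r} \Lambda(\ell) \\
&= \sum_{d \le u} \mu(d) \sum_{r \le N/d} e(\alpha dr)\log r \\
&\ll \sum_{d \le u} \Bigg| \sum_{r \le N/d} e(\alpha dr)\log r \Bigg|.
\end{aligned}
\]
We compute the inner sum by writing the logarithm as an integral and interchanging
summations:
\[
\begin{aligned}
\sum_{r \le N/d} e(\alpha dr)\log r
&= \sum_{r=2}^{[N/d]} e(\alpha dr) \int_1^r \frac{dx}{x} \\
&= \sum_{r=2}^{[N/d]} e(\alpha dr) \sum_{s=2}^{r} \int_{s-1}^{s} \frac{dx}{x} \\
&= \sum_{s=2}^{[N/d]} \sum_{r=s}^{[N/d]} \int_{s-1}^{s} e(\alpha dr)\,\frac{dx}{x} \\
&= \sum_{s=2}^{[N/d]} \int_{s-1}^{s} \Bigg( \sum_{r=s}^{[N/d]} e(\alpha dr) \Bigg) \frac{dx}{x}.
\end{aligned}
\]
By Lemma 4.7, the geometric progression inside the integral sign is bounded above
by
\[
\sum_{r=s}^{[N/d]} e(\alpha dr) \ll \min\left( \frac{N}{d}, \|\alpha d\|^{-1} \right),
\]
and so
\[
\sum_{r \le N/d} e(\alpha dr)\log r \ll \min\left( \frac{N}{d}, \|\alpha d\|^{-1} \right) \log N.
\]
By Lemma 4.10, we have
\[
\sum_{d \le u} \min\left( \frac{N}{d}, \|\alpha d\|^{-1} \right) \ll \left( \frac{N}{q} + N^{2/5} + q \right) \log N.
\]
Therefore,
\[
S_1 \ll \sum_{d \le u} \min\left( \frac{N}{d}, \|\alpha d\|^{-1} \right) \log N
\ll \left( \frac{N}{q} + N^{2/5} + q \right) (\log N)^2.
\]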


This completes the proof. 
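
The proof of Lemma 8.6 starts from the identity $\sum_{\ell \mid r} \Lambda(\ell) = \log r$. As a quick sanity check, the sketch below evaluates both sides numerically for small r; the helper names are illustrative only.

```python
from math import log


def von_mangoldt(n):
    """Lambda(n) = log p if n is a power of a prime p, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            # n was a prime power exactly when nothing is left over
            return log(p) if n == 1 else 0.0
        p += 1
    return log(n)  # no divisor up to sqrt(n): n itself is prime


def divisor_sum_of_lambda(r):
    """Sum of Lambda(l) over all divisors l of r."""
    return sum(von_mangoldt(l) for l in range(1, r + 1) if r % l == 0)


for r in (1, 12, 97, 360):
    print(r, divisor_sum_of_lambda(r), log(r))
```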
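Lemma 8.7 If
\[
\left| \alpha - \frac{a}{q} \right| \le \frac{1}{q^2},
\]
where $1 \le q \le N$ and $(a,q) = 1$, then
\[
|S_2| \ll \left( \frac{N}{q} + N^{4/5} + q \right) (\log N)^2.
\]

Proof. If $d \le N^{2/5}$ and $\ell \le N^{2/5}$, then $d\ell \le N^{4/5}$. Making the substitution
$k = d\ell$, we obtain
\[
\begin{aligned}
S_2 &= \sum_{d \le N^{2/5}} \sum_{\ell \le N^{2/5}} \sum_{m \le N/(d\ell)} \mu(d)\Lambda(\ell)e(\alpha d\ell m) \\
&= \sum_{k \le N^{4/5}} \Bigg( \sum_{m \le N/k} e(\alpha km) \Bigg) \sum_{\substack{k = d\ell \\ d,\ell \le N^{2/5}}} \mu(d)\Lambda(\ell).
\end{aligned}
\]
Since
\[
\sum_{\substack{k = d\ell \\ d,\ell \le N^{2/5}}} \mu(d)\Lambda(\ell)
\ll \sum_{\substack{k = d\ell \\ d,\ell \le N^{2/5}}} \Lambda(\ell)
\le \sum_{\ell \mid k} \Lambda(\ell) = \log k \ll \log N,
\]
it follows again from Lemma 4.10 that
\[
\begin{aligned}
S_2 &\ll \log N \sum_{k \le N^{4/5}} \Bigg| \sum_{m \le N/k} e(\alpha km) \Bigg| \\
&\ll \log N \sum_{k \le N^{4/5}} \min\left( \frac{N}{k}, \|\alpha k\|^{-1} \right) \\
&\ll \left( \frac{N}{q} + N^{4/5} + q \right) (\log N)^2.
\end{aligned}
\]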
This completes the proof. 


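Lemma 8.8 If
\[
\left| \alpha - \frac{a}{q} \right| \le \frac{1}{q^2},
\]
where $1 \le q \le N$ and $(a,q) = 1$, then
\[
|S_3| \ll \left( \frac{N}{q^{1/2}} + N^{4/5} + N^{1/2} q^{1/2} \right) (\log N)^4.
\]

Proof. Let $u = N^{2/5}$ and
\[
h = \left[ \frac{\log N}{5\log 2} \right] + 1.
\]
Then $N^{1/5} < 2^h \le 2N^{1/5}$ and $h \ll \log N$. If $i \le h$, then $2^i u \le 2N^{3/5} < N$. If
$N^{2/5} < \ell \le N/k$, then
\[
k \le N/\ell < N^{3/5} = N^{1/5} u < 2^h u,
\]
and so
\[
\begin{aligned}
S_3 &= \sum_{k > N^{2/5}} \sum_{N^{2/5} < \ell \le N/k} M_u(k)\Lambda(\ell)e(\alpha k\ell) \\
&= \sum_{i=1}^{h} \sum_{2^{i-1}u < k \le 2^i u} M_u(k) \sum_{u < \ell \le N/k} \Lambda(\ell)e(\alpha k\ell) \\
&= \sum_{i=1}^{h} S_{3,i},
\end{aligned}
\]
where
\[
S_{3,i} = \sum_{2^{i-1}u < k \le 2^i u} M_u(k) \sum_{u < \ell \le N/k} \Lambda(\ell)e(\alpha k\ell).
\]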


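By the Cauchy-Schwarz inequality,
\[
|S_{3,i}|^2 \le \sum_{2^{i-1}u < k \le 2^i u} |M_u(k)|^2 \sum_{2^{i-1}u < k \le 2^i u} \Bigg| \sum_{u < \ell \le N/k} \Lambda(\ell)e(\alpha k\ell) \Bigg|^2. \tag{8.5}
\]
We shall estimate these sums separately.
To estimate the first sum in (8.5), we observe that
\[
|M_u(k)| = \Bigg| \sum_{\substack{d \mid k \\ d \le u}} \mu(d) \Bigg| \le \sum_{\substack{d \mid k \\ d \le u}} 1 \le d(k),
\]
where $d(k)$ is the divisor function. It follows from Theorem A.14 that
\[
\sum_{2^{i-1}u < k \le 2^i u} |M_u(k)|^2 \le \sum_{2^{i-1}u < k \le 2^i u} d(k)^2
\ll 2^i u \left(\log 2^i u\right)^3
\ll 2^i u (\log N)^3.
\]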


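Next, we estimate the second sum in (8.5). We have
\[
\begin{aligned}
\sum_{2^{i-1}u < k \le 2^i u} \Bigg| \sum_{u < \ell \le N/k} \Lambda(\ell)e(\alpha k\ell) \Bigg|^2
&= \sum_{2^{i-1}u < k \le 2^i u} \sum_{u < \ell \le N/k} \sum_{u < m \le N/k} \Lambda(\ell)\Lambda(m)e(\alpha k(\ell - m)) \\
&= \sum_{u < \ell < N/(2^{i-1}u)} \sum_{u < m < N/(2^{i-1}u)} \Lambda(\ell)\Lambda(m) \sum_{k \in I(\ell,m)} e(\alpha k(\ell - m)),
\end{aligned}
\]
where $I(\ell,m)$ is the interval of consecutive integers $k$ such that
\[
2^{i-1}u < k \le \min\left( 2^i u, \frac{N}{\ell}, \frac{N}{m} \right).
\]
Clearly,
\[
|I(\ell,m)| \le 2^{i-1}u,
\]
and so
\[
\sum_{k \in I(\ell,m)} e(\alpha k(\ell - m)) \ll \min\left( 2^{i-1}u, \|\alpha(\ell - m)\|^{-1} \right).
\]
Since $0 \le \Lambda(\ell), \Lambda(m) \le \log N$ for all integers $\ell, m \in [1, N]$, we have
\[
\begin{aligned}
\sum_{2^{i-1}u < k \le 2^i u} \Bigg| \sum_{u < \ell \le N/k} \Lambda(\ell)e(\alpha k\ell) \Bigg|^2
&\ll \sum_{u < \ell < N/(2^{i-1}u)} \sum_{u < m < N/(2^{i-1}u)} \Lambda(\ell)\Lambda(m) \min\left( 2^{i-1}u, \|\alpha(\ell - m)\|^{-1} \right) \\
&\ll (\log N)^2 \sum_{u < \ell < N/(2^{i-1}u)} \sum_{u < m < N/(2^{i-1}u)} \min\left( 2^{i-1}u, \|\alpha(\ell - m)\|^{-1} \right).
\end{aligned}
\]
Let $j = \ell - m$ with $u < \ell, m < N/(2^{i-1}u)$. Then $|j| < N/2^{i-1}u$, and the number
of representations of an integer $j$ in this form is at most $N/2^{i-1}u$. By Lemma 4.10,
we have
\[
\begin{aligned}
\sum_{2^{i-1}u < k \le 2^i u} \Bigg| \sum_{u < \ell \le N/k} \Lambda(\ell)e(\alpha k\ell) \Bigg|^2
&\ll (\log N)^2 \frac{N}{2^{i-1}u} \sum_{1 \le j \le N/2^{i-1}u} \min\left( 2^{i-1}u, \|\alpha j\|^{-1} \right) \\
&\ll (\log N)^2 \frac{N}{2^{i-1}u} \sum_{1 \le j \le N/2^{i-1}u} \min\left( \frac{N}{j}, \|\alpha j\|^{-1} \right) \\
&\ll \frac{N}{2^{i-1}u} \left( \frac{N}{q} + \frac{N}{2^{i-1}u} + q \right) (\log N)^3.
\end{aligned}
\]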


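Inserting this into inequality (8.5), we obtain
\[
\begin{aligned}
|S_{3,i}|^2 &\ll \left( 2^i u (\log N)^3 \right) \frac{N}{2^{i-1}u} \left( \frac{N}{q} + \frac{N}{2^{i-1}u} + q \right) (\log N)^3 \\
&\ll N^2 (\log N)^6 \left( \frac{1}{q} + \frac{1}{u} + \frac{q}{N} \right).
\end{aligned}
\]
Therefore,
\[
|S_{3,i}| \ll N (\log N)^3 \left( \frac{1}{q^{1/2}} + \frac{1}{u^{1/2}} + \frac{q^{1/2}}{N^{1/2}} \right).
\]
Since $h \ll \log N$, we have
\[
S_3 = \sum_{i=1}^{h} S_{3,i} \ll (\log N)^4 \left( \frac{N}{q^{1/2}} + N^{4/5} + N^{1/2} q^{1/2} \right).
\]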


This completes the proof. 

Finally, we obtain Vinogradov's estimate for the exponential sum $F(\alpha)$ by
inserting our estimates for the sums $S_1$, $S_2$, and $S_3$ into Lemma 8.5. This completes
the proof of Theorem 8.5.


8.6 Proof of the asymptotic formula 


We can now estimate the integral over the minor arcs. 


Theorem 8.6 For any $B > 0$, we have
\[
\int_{\mathfrak{m}} F(\alpha)^3 e(-\alpha N)\,d\alpha \ll \frac{N^2}{(\log N)^{(B/2)-5}},
\]
where the implied constant depends only on $B$.


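Proof. Let $\alpha \in \mathfrak{m} = [0,1] \setminus \mathfrak{M}$. By Dirichlet's theorem (Theorem 4.1), for
any real number $\alpha$ there exists a fraction $a/q \in [0,1]$ with $1 \le q \le N/Q$ and
$(a,q) = 1$ such that
\[
\left| \alpha - \frac{a}{q} \right| \le \frac{Q}{qN} \le \min\left( \frac{Q}{N}, \frac{1}{q^2} \right).
\]
If $q \le Q$, then $\alpha \in \mathfrak{M}(q,a) \subseteq \mathfrak{M}$, which is false. Therefore,
\[
Q < q \le \frac{N}{Q}.
\]
By Theorem 8.5,
\[
\begin{aligned}
F(\alpha) &\ll \left( \frac{N}{q^{1/2}} + N^{4/5} + N^{1/2} q^{1/2} \right) (\log N)^4 \\
&\ll \left( \frac{N}{(\log N)^{B/2}} + N^{4/5} + \frac{N}{(\log N)^{B/2}} \right) (\log N)^4 \\
&\ll \frac{N}{(\log N)^{(B/2)-4}}.
\end{aligned}
\]
Since $\vartheta(N) = \sum_{p \le N} \log p \ll N$ by Theorem 6.3, we have
\[
\int_0^1 |F(\alpha)|^2\,d\alpha = \sum_{p \le N} (\log p)^2 \le \log N \sum_{p \le N} \log p \ll N \log N,
\]
and so
\[
\begin{aligned}
\int_{\mathfrak{m}} |F(\alpha)|^3\,d\alpha &\ll \sup\{ |F(\alpha)| : \alpha \in \mathfrak{m} \} \int_{\mathfrak{m}} |F(\alpha)|^2\,d\alpha \\
&\ll \frac{N}{(\log N)^{(B/2)-4}} \int_0^1 |F(\alpha)|^2\,d\alpha \\
&\ll \frac{N^2}{(\log N)^{(B/2)-5}}.
\end{aligned}
\]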
This completes the proof. 


Theorem 8.7 (Vinogradov) Let $\mathfrak{S}(N)$ be the singular series for the ternary
Goldbach problem. For all sufficiently large odd integers $N$ and for every $A > 0$,
\[
R(N) = \mathfrak{S}(N)\frac{N^2}{2} + O\left( \frac{N^2}{(\log N)^A} \right),
\]
where the implied constant depends only on $A$.


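Proof. It follows from Theorem 8.4 and Theorem 8.6 that, for any positive
numbers $B$, $C$, and $\varepsilon$ with $C > 2B$,
\[
\begin{aligned}
R(N) &= \int_0^1 F(\alpha)^3 e(-N\alpha)\,d\alpha \\
&= \int_{\mathfrak{M}} F(\alpha)^3 e(-N\alpha)\,d\alpha + \int_{\mathfrak{m}} F(\alpha)^3 e(-N\alpha)\,d\alpha \\
&= \mathfrak{S}(N)\frac{N^2}{2} + O\left( \frac{N^2}{(\log N)^{(1-\varepsilon)B}} \right) + O\left( \frac{N^2}{(\log N)^{C-5B}} \right) + O\left( \frac{N^2}{(\log N)^{(B/2)-5}} \right),
\end{aligned}
\]
where the implied constants depend only on $B$, $C$, and $\varepsilon$. For any $A > 0$, let
$B = 2A + 10$ and $C = A + 5B$. Let $\varepsilon = 1/2$. Then
\[
\min\left( (1-\varepsilon)B,\ C - 5B,\ (B/2) - 5 \right) = A,
\]
and so
\[
R(N) = \mathfrak{S}(N)\frac{N^2}{2} + O\left( \frac{N^2}{(\log N)^A} \right).
\]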
This completes the proof. 
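
The choice of parameters in this proof is simple arithmetic, and it can be confirmed with exact rational numbers. The sketch below (the function name `exponents` is illustrative) returns the three exponents of $\log N$ in the error terms for a given $A$.

```python
from fractions import Fraction


def exponents(A):
    """Exponents (1 - eps)B, C - 5B, B/2 - 5 for B = 2A + 10, C = A + 5B, eps = 1/2."""
    A = Fraction(A)
    B = 2 * A + 10
    C = A + 5 * B
    eps = Fraction(1, 2)
    return (1 - eps) * B, C - 5 * B, B / 2 - 5


for A in (1, 3, Fraction(7, 2)):
    print(A, exponents(A), min(exponents(A)) == A)
```

The first exponent equals $A + 5$ and the other two equal $A$, so the minimum is $A$; the hypothesis $C > 2B$ also holds, since $C - 2B = A + 3B > 0$.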
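We can now derive Vinogradov's asymptotic formula for $r(N)$.
Proof of Theorem 8.1. We get an upper bound for $R(N)$ as follows:
\[
R(N) = \sum_{p_1 + p_2 + p_3 = N} \log p_1 \log p_2 \log p_3
\le (\log N)^3 \sum_{p_1 + p_2 + p_3 = N} 1
= (\log N)^3 r(N).
\]
For $0 < \delta < 1/2$, let $r_\delta(N)$ denote the number of representations of $N$ in the form
$N = p_1 + p_2 + p_3$ such that $p_i \le N^{1-\delta}$ for some $i$. Then
\[
\begin{aligned}
r_\delta(N) &\le 3 \sum_{\substack{p_1 + p_2 + p_3 = N \\ p_1 \le N^{1-\delta}}} 1 \\
&\le 3 \sum_{p_1 \le N^{1-\delta}} \Bigg( \sum_{p_2 + p_3 = N - p_1} 1 \Bigg) \\
&\le 3 \sum_{p_1 \le N^{1-\delta}} \Bigg( \sum_{p_2 < N} 1 \Bigg) \\
&\le 3\pi\left(N^{1-\delta}\right)\pi(N) \\
&\ll \frac{N^{2-\delta}}{(\log N)^2}.
\end{aligned}
\]
We can now get a lower bound for $R(N)$:
\[
\begin{aligned}
R(N) &\ge \sum_{\substack{p_1 + p_2 + p_3 = N \\ p_1, p_2, p_3 > N^{1-\delta}}} \log p_1 \log p_2 \log p_3 \\
&\ge (1-\delta)^3 (\log N)^3 \sum_{\substack{p_1 + p_2 + p_3 = N \\ p_1, p_2, p_3 > N^{1-\delta}}} 1 \\
&\ge (1-\delta)^3 (\log N)^3 \left( r(N) - r_\delta(N) \right) \\
&\gg (1-\delta)^3 (\log N)^3 \left( r(N) - \frac{N^{2-\delta}}{(\log N)^2} \right).
\end{aligned}
\]
Therefore,
\[
(\log N)^3 r(N) \le (1-\delta)^{-3} R(N) + O\left( (\log N) N^{2-\delta} \right).
\]
If $0 < \delta < 1/2$, then $1/2 < 1 - \delta < 1$ and
\[
0 < (1-\delta)^{-3} - 1 = \frac{1 - (1-\delta)^3}{(1-\delta)^3} \le 8\left( 1 - (1-\delta)^3 \right) < 24\delta.
\]
By Theorem 8.7, $R(N) \ll N^2$ and so
\[
\begin{aligned}
0 \le (\log N)^3 r(N) - R(N) &\le \left( (1-\delta)^{-3} - 1 \right) R(N) + O\left( (\log N) N^{2-\delta} \right) \\
&\ll \delta R(N) + (\log N) N^{2-\delta} \\
&\ll \delta N^2 + (\log N) N^{2-\delta} \\
&= N^2 \left( \delta + \frac{\log N}{N^\delta} \right).
\end{aligned}
\]
This inequality holds for all $\delta \in (0, 1/2)$, and the implied constant does not depend
on $\delta$. Let
\[
\delta = \frac{2\log\log N}{\log N}.
\]
Then $N^\delta = (\log N)^2$ and
\[
\delta + \frac{\log N}{N^\delta} = \frac{2\log\log N}{\log N} + \frac{\log N}{(\log N)^2} \ll \frac{\log\log N}{\log N},
\]
and so
\[
0 \le (\log N)^3 r(N) - R(N) \ll \frac{N^2 \log\log N}{\log N}.
\]
Let $A \ge 1$. By Theorem 8.7,
\[
\begin{aligned}
(\log N)^3 r(N) &= R(N) + O\left( \frac{N^2 \log\log N}{\log N} \right) \\
&= \mathfrak{S}(N)\frac{N^2}{2} + O\left( \frac{N^2}{(\log N)^A} \right) + O\left( \frac{N^2 \log\log N}{\log N} \right) \\
&= \mathfrak{S}(N)\frac{N^2}{2} \left( 1 + O\left( \frac{\log\log N}{\log N} \right) \right).
\end{aligned}
\]
Dividing by $(\log N)^3$, we obtain
\[
r(N) = \mathfrak{S}(N)\frac{N^2}{2(\log N)^3} \left( 1 + O\left( \frac{\log\log N}{\log N} \right) \right).
\]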


This completes the proof. 
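
The quantities $r(N)$ and $R(N)$ compared in this proof are easy to compute for small $N$. The following brute-force sketch (function names are illustrative) counts ordered triples of primes with sum $N$, both unweighted and weighted by $\log p_1 \log p_2 \log p_3$, so that the inequality $R(N) \le (\log N)^3 r(N)$ can be observed directly.

```python
from math import log


def primes_up_to(n):
    """Sieve of Eratosthenes: list of primes p <= n (n >= 2)."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, n + 1, p):
                sieve[m] = False
    return [p for p in range(2, n + 1) if sieve[p]]


def ternary_counts(N):
    """Return (r, R): r counts ordered prime triples with p1 + p2 + p3 = N,
    and R weights each triple by log p1 * log p2 * log p3."""
    primes = primes_up_to(N)
    is_prime = set(primes)
    r, R = 0, 0.0
    for p1 in primes:
        for p2 in primes:
            p3 = N - p1 - p2
            if p3 < 2:
                break  # larger p2 only makes p3 smaller
            if p3 in is_prime:
                r += 1
                R += log(p1) * log(p2) * log(p3)
    return r, R


r, R = ternary_counts(101)
print(r, R, log(101) ** 3 * r)
```

For $N$ this small the ratio $R(N) / ((\log N)^3 r(N))$ is still far from 1, which is consistent with the slowly decaying relative error $\log\log N / \log N$ in Theorem 8.1.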


8.7 Notes 


For Vinogradov’s original papers, see [132, 133]. Vaughan [124] greatly simplified 
Vinogradov’s estimate for the exponential sum F(a) (Theorem 8.5), and it is 
Vaughan’s proof that is given in this book. There are many good expositions of 
Vinogradov’s theorem. See, for example, the books of Davenport [19], Ellison [29], 
Estermann [38], Hua [64], Vaughan [125], and Vinogradov [135]. 

Vinogradov's theorem implies that almost all positive even integers can be written
as the sum of two primes. This was observed independently by Chudakov [14],
van der Corput [123], and Estermann [37]. Let $E$ denote the set of even integers
greater than two that cannot be written as the sum of two primes. The set $E$ is called
the exceptional set for the Goldbach conjecture. Let $E(x)$ denote the number of
integers in $E$ not exceeding $x$. The theorem of Chudakov, van der Corput, and
Estermann states that $E(x) \ll_A x/(\log x)^A$ for every $A > 0$. Montgomery and
Vaughan [84] proved that there exists $\delta < 1$ such that $E(x) \ll x^\delta$. Of course, if
the Goldbach conjecture is true, then $E(x) = 0$ for all $x$.


8.8 Exercise 


1. Let $h \ge 3$. Find an asymptotic formula for the number of representations of
a positive integer $N \equiv h \pmod 2$ as a sum of $h$ prime numbers.


9


The linear sieve


We often apply, consciously or not, some kind of sieve procedure
whenever the subject of investigation is not directly recognizable. We
begin by making a long list of suspects, and then we sort it out gradually
by excluding obvious cases with respect to available information.
The process of exclusion itself may yield new data which influences
our decision about what to exclude or include in the next run. When no
clue is provided to drive us further, the process terminates and we are
left with objects which can be examined by other means to determine
their exact identity. These universal ideas were formalized in the context
of arithmetic back in the second century B.C. by Eratosthenes,
and are still used today.


H. Iwaniec [68]


9.1 A general sieve


In the next chapter, we shall prove Chen's theorem that every sufficiently large
even integer can be written as the sum of a prime and a number that is the product
of at most two primes. The proof will require more sophisticated sieve estimates
than those obtained from the Selberg sieve in Chapter 7.


We begin by generalizing our concept of a sieve. Let $A = \{a(n)\}_{n=1}^{\infty}$ be an
arithmetic function such that
\[
a(n) \ge 0 \quad \text{for all } n \tag{9.1}
\]
and
\[
|A| = \sum_{n=1}^{\infty} a(n) < \infty. \tag{9.2}
\]
Let $\mathcal{P}$ be a set of prime numbers and let $z$ be a real number, $z \ge 2$. The set $\mathcal{P}$ is
called the sieving range, and the number $z$ is called the sieving level. Let
\[
P(z) = \prod_{\substack{p \in \mathcal{P} \\ p < z}} p.
\]
The sieving function is
\[
S(A, \mathcal{P}, z) = \sum_{(n, P(z)) = 1} a(n).
\]


The goal of sieve theory is to obtain “good” upper and lower bounds for this 
function. 

For example, let $A$ be the characteristic function of a finite set of positive integers,
that is, $a(n) = 1$ if $n$ is in the set and $a(n) = 0$ if $n$ is not in the set. Then $|A|$ is
the cardinality of the set. The sieving function $S(A, \mathcal{P}, z)$ counts the number of
integers in the set that are not divisible by any prime $p \in \mathcal{P}$, $p < z$. This special
case is exactly the sieving function for which we obtained, in Chapter 7, an upper
bound by means of the Selberg sieve.

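Using the fundamental property of the Möbius function, that
\[
(1 * \mu)(m) = \sum_{d \mid m} \mu(d) = \begin{cases} 1 & \text{if } m = 1 \\ 0 & \text{if } m > 1, \end{cases}
\]
where $1$ denotes the arithmetic function such that $1(n) = 1$ for all $n \ge 1$, we obtain
Legendre's formula
\[
\begin{aligned}
S(A, \mathcal{P}, z) &= \sum_{(n, P(z)) = 1} a(n) \\
&= \sum_{n} a(n) \sum_{d \mid (n, P(z))} \mu(d) \\
&= \sum_{d \mid P(z)} \mu(d) \sum_{d \mid n} a(n) \\
&= \sum_{d \mid P(z)} \mu(d) |A_d|,
\end{aligned}
\]
where the series
\[
|A_d| = \sum_{d \mid n} a(n)
\]
converges because of (9.1) and (9.2).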
We shall assume that, for every $n \ge 1$, we have a multiplicative function $g_n(d)$
such that
\[
0 \le g_n(p) < 1
\]
for every prime $p \in \mathcal{P}$. Then
\[
0 \le g_n(d) < 1
\]
for every integer $d$ that is the product of distinct primes $p \in \mathcal{P}$. For such integers
$d$, the series
\[
\sum_{n} a(n) g_n(d)
\]
converges, and we can define the remainder $r(d)$ by
\[
|A_d| = \sum_{n} a(n) g_n(d) + r(d).
\]
Inserting this into Legendre's formula, we obtain
\[
\begin{aligned}
S(A, \mathcal{P}, z) &= \sum_{d \mid P(z)} \mu(d) |A_d| \\
&= \sum_{d \mid P(z)} \mu(d) \left( \sum_{n} a(n) g_n(d) + r(d) \right) \\
&= \sum_{n} a(n) \sum_{d \mid P(z)} \mu(d) g_n(d) + \sum_{d \mid P(z)} \mu(d) r(d) \\
&= \sum_{n} a(n) \prod_{p \mid P(z)} (1 - g_n(p)) + \sum_{d \mid P(z)} \mu(d) r(d) \\
&= \sum_{n} a(n) V_n(z) + R(z),
\end{aligned}
\]
where
\[
V_n(z) = \prod_{p \mid P(z)} (1 - g_n(p))
\]
and
\[
R(z) = \sum_{d \mid P(z)} \mu(d) r(d).
\]

If $P(z)$ has a large number of divisors, the remainder term $R(z)$ in Legendre's
formula may be too large to give useful estimates for $S(A, \mathcal{P}, z)$. For example, let
$A$ be the characteristic function of the set of all positive integers not exceeding $x$,
and let $\mathcal{P}$ be the set of all prime numbers. Let
\[
g_n(d) = \frac{1}{d}
\]
for all $n$. Then
\[
V_n(z) = \prod_{p < z} \left( 1 - \frac{1}{p} \right)
\]
for all $n \ge 1$. Moreover, for all $d \ge 1$,
\[
0 \le |r(d)| = \left| |A_d| - \frac{[x]}{d} \right| = \frac{[x]}{d} - \left[ \frac{x}{d} \right] < 1,
\]
and so
\[
|R(z)| \le \sum_{d \mid P(z)} |r(d)| < 2^{\pi(z)}.
\]
It follows from Legendre's formula that the number of integers up to $x$ divisible
by no prime less than $z$ is
\[
S(A, \mathcal{P}, z) = [x] \prod_{p < z} \left( 1 - \frac{1}{p} \right) + O\left( 2^{\pi(z)} \right).
\]
By Mertens's formula (Theorem 6.8),
\[
\prod_{p < z} \left( 1 - \frac{1}{p} \right) = \frac{e^{-\gamma}}{\log z} \left( 1 + O\left( \frac{1}{\log z} \right) \right) \tag{9.3}
\]
and so the remainder term will be larger than the main term unless $z$ is very small
compared to $x$.
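
To make this example concrete, the sketch below evaluates Legendre's formula for the integers up to $x$ with exact rational arithmetic. It assumes integer $x$ and $z$ and takes $\mathcal{P}$ to be all primes; the function names are illustrative. It returns the direct count, the Möbius sum $\sum_{d \mid P(z)} \mu(d)[x/d]$, the main term $x\prod_{p<z}(1 - 1/p)$, and the remainder $R(z)$.

```python
from fractions import Fraction


def primes_below(z):
    """Primes p < z by trial division (adequate for small z)."""
    return [p for p in range(2, z)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]


def squarefree_divisors(primes):
    """All pairs (d, mu(d)) with d dividing the product of the given primes."""
    divs = [(1, 1)]
    for p in primes:
        divs += [(d * p, -mu) for d, mu in divs]
    return divs


def legendre(x, z):
    """Return (exact, via_mobius, main, R, number of primes below z)."""
    ps = primes_below(z)
    divs = squarefree_divisors(ps)
    exact = sum(1 for n in range(1, x + 1) if all(n % p for p in ps))
    via_mobius = sum(mu * (x // d) for d, mu in divs)
    main = Fraction(x)
    for p in ps:
        main *= 1 - Fraction(1, p)
    # r(d) = |A_d| - x/d = [x/d] - x/d, summed against mu(d)
    R = sum(mu * (x // d - Fraction(x, d)) for d, mu in divs)
    return exact, via_mobius, main, R, len(ps)


exact, via_mobius, main, R, k = legendre(1000, 12)
print(exact, via_mobius, float(main), float(R), 2 ** k)
```

The number of terms in $R(z)$ doubles with each new prime, which is the practical reason the text replaces the Möbius function by functions of bounded support.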

The sieve idea is to reduce the size of the error term by replacing the Möbius
function with carefully constructed arithmetic functions $\lambda^+(d)$ and $\lambda^-(d)$ such that
\[
\lambda^+(1) = \lambda^-(1) = 1 \tag{9.4}
\]
and, for every $m \ge 2$,
\[
(1 * \lambda^+)(m) = \sum_{d \mid m} \lambda^+(d) \ge 0 \tag{9.5}
\]
and
\[
(1 * \lambda^-)(m) = \sum_{d \mid m} \lambda^-(d) \le 0. \tag{9.6}
\]
Let $\lambda^+(d)$ and $\lambda^-(d)$ be arithmetic functions that satisfy (9.4), (9.5), and (9.6). If
$D$ is a positive number such that $\lambda^+(d) = 0$ for all $d \ge D$, then the arithmetic
function $\lambda^+(d)$ is called an upper bound sieve with support level $D$. Similarly, if
$D$ is a positive number such that $\lambda^-(d) = 0$ for all $d \ge D$, then the arithmetic
function $\lambda^-(d)$ is called a lower bound sieve with support level $D$.

If $\mathcal{P}$ is a set of primes such that $\lambda^+(d) = 0$ whenever $d$ is divisible by a prime not
in $\mathcal{P}$, then $\lambda^+(d)$ is called an upper bound sieve with sieving range $\mathcal{P}$. Similarly,
if $\lambda^-(d) = 0$ whenever $d$ is divisible by a prime not in $\mathcal{P}$, then $\lambda^-(d)$ is called a
lower bound sieve with sieving range $\mathcal{P}$.

The following result is the basic sieve inequality. 


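Theorem 9.1 Let $\lambda^+(d)$ be an upper bound sieve with sieving range $\mathcal{P}$ and support
level $D$, and let $\lambda^-(d)$ be a lower bound sieve with sieving range $\mathcal{P}$ and support
level $D$. Then
\[
\sum_{n} a(n) G_n(z, \lambda^-) + R^- \le S(A, \mathcal{P}, z) \le \sum_{n} a(n) G_n(z, \lambda^+) + R^+,
\]
where
\[
G_n(z, \lambda^{\pm}) = \sum_{d \mid P(z)} \lambda^{\pm}(d) g_n(d)
\]
and
\[
R^{\pm} = \sum_{\substack{d \mid P(z) \\ d < D}} \lambda^{\pm}(d) r(d).
\]

Proof. Since the arithmetic function $\lambda^+(d)$ is supported on the finite set of
integers $1 \le d < D$, it follows that the series
\[
\sum_{n} a(n) \sum_{d \mid (n, P(z))} \lambda^+(d)
\]
converges. By conditions (9.4) and (9.5), the inner sum is 1 if $(n, P(z)) = 1$ and
nonnegative for all $n$. Therefore,
\[
\begin{aligned}
S(A, \mathcal{P}, z) &= \sum_{(n, P(z)) = 1} a(n) \\
&\le \sum_{n} a(n) \sum_{d \mid (n, P(z))} \lambda^+(d) \\
&= \sum_{d \mid P(z)} \lambda^+(d) \sum_{d \mid n} a(n) \\
&= \sum_{d \mid P(z)} \lambda^+(d) |A_d| \\
&= \sum_{d \mid P(z)} \lambda^+(d) \left( \sum_{n} a(n) g_n(d) + r(d) \right) \\
&= \sum_{d \mid P(z)} \lambda^+(d) \sum_{n} a(n) g_n(d) + \sum_{d \mid P(z)} \lambda^+(d) r(d) \\
&= \sum_{n} a(n) \sum_{d \mid P(z)} \lambda^+(d) g_n(d) + \sum_{\substack{d \mid P(z) \\ d < D}} \lambda^+(d) r(d) \\
&= \sum_{n} a(n) G_n(z, \lambda^+) + R^+.
\end{aligned}
\]
The proof of the lower bound is similar.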
The following result shows how to extend the sieving range of upper and lower 
bound sieves by any finite set of primes. 


Lemma 9.1 Let $\lambda_1^{\pm}(d)$ be upper and lower bound sieves with sieving range $\mathcal{P}_1$
and support level $D$. Let $\mathcal{Q}$ be a finite set of primes disjoint from $\mathcal{P}_1$, and let $Q$ be
the product of all primes in $\mathcal{Q}$. Every positive integer $d$ can be written uniquely in
the form
\[
d = d_1 d_2,
\]
where $d_1$ is relatively prime to $Q$ and $d_2$ is a product of primes in $\mathcal{Q}$. Define
\[
\lambda^{\pm}(d) = \lambda_1^{\pm}(d_1)\mu(d_2). \tag{9.7}
\]
Then the function $\lambda^+(d)$ (resp. $\lambda^-(d)$) is an upper bound sieve (resp. lower bound
sieve) with sieving range
\[
\mathcal{P} = \mathcal{P}_1 \cup \mathcal{Q}
\]
and support level $DQ$.
Let $g$ be a multiplicative function, and let
\[
G(z, \lambda^{\pm}) = \sum_{d \mid P(z)} \lambda^{\pm}(d) g(d)
\]
and
\[
G(z, \lambda_1^{\pm}) = \sum_{d \mid P_1(z)} \lambda_1^{\pm}(d) g(d).
\]
Then
\[
G(z, \lambda^{\pm}) = G(z, \lambda_1^{\pm}) \prod_{q \mid Q(z)} (1 - g(q)).
\]


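Proof. Clearly, $\lambda^+(1) = \lambda^-(1) = 1$. Every positive integer $m$ factors uniquely
into a product $m = m_1 m_2$, where $m_1$ is relatively prime to $Q$ and $m_2$ is a product
of primes in $\mathcal{Q}$. We have
\[
\sum_{d \mid m} \lambda^+(d) = \sum_{d_1 \mid m_1} \sum_{d_2 \mid m_2} \lambda_1^+(d_1)\mu(d_2)
= \sum_{d_1 \mid m_1} \lambda_1^+(d_1) \sum_{d_2 \mid m_2} \mu(d_2) \ge 0
\]
since
\[
\sum_{d_2 \mid m_2} \mu(d_2) = \begin{cases} 1 & \text{if } m_2 = 1 \\ 0 & \text{if } m_2 \ge 2. \end{cases}
\]
Similarly, if $m = m_1 m_2 > 1$, then
\[
\sum_{d \mid m} \lambda^-(d) = \sum_{d_1 \mid m_1} \lambda_1^-(d_1) \sum_{d_2 \mid m_2} \mu(d_2) \le 0,
\]
since either $m_2 > 1$ and
\[
\sum_{d_2 \mid m_2} \mu(d_2) = 0,
\]
or $m_2 = 1$, which implies that $m_1 > 1$, and so
\[
\sum_{d_1 \mid m_1} \lambda_1^-(d_1) \le 0.
\]
Thus, the arithmetic functions $\lambda^{\pm}(d)$ satisfy conditions (9.4), (9.5), and (9.6).
Since $\lambda^{\pm}(d) = 0$ if $d$ is divisible by some prime not in $\mathcal{P}$, it follows that the
functions $\lambda^{\pm}$ have sieving range $\mathcal{P}$.
Let $d = d_1 d_2$, where $d_1$ is relatively prime to $Q$ and $d_2$ is a product of primes
in $\mathcal{Q}$. If $d = d_1 d_2 \ge DQ$, then either $d_1 \ge D$ and $\lambda_1^{\pm}(d_1) = 0$, or $d_2 > Q$, which
implies that $d_2$ is divisible by the square of some prime $q \in \mathcal{Q}$, and so $\mu(d_2) = 0$.
In both cases, $\lambda^{\pm}(d) = 0$. Therefore, the functions $\lambda^{\pm}(d)$ have support level
$DQ$.

Finally, since $P(z) = P_1(z)Q(z)$,
\[
\begin{aligned}
G(z, \lambda^{\pm}) &= \sum_{d \mid P(z)} \lambda^{\pm}(d) g(d) \\
&= \sum_{d_1 \mid P_1(z)} \sum_{d_2 \mid Q(z)} \lambda^{\pm}(d_1 d_2) g(d_1 d_2) \\
&= \sum_{d_1 \mid P_1(z)} \sum_{d_2 \mid Q(z)} \lambda_1^{\pm}(d_1) g(d_1) \mu(d_2) g(d_2) \\
&= \sum_{d_1 \mid P_1(z)} \lambda_1^{\pm}(d_1) g(d_1) \sum_{d_2 \mid Q(z)} \mu(d_2) g(d_2) \\
&= G(z, \lambda_1^{\pm}) \prod_{q \mid Q(z)} (1 - g(q)).
\end{aligned}
\]
This completes the proof.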
Combining Theorem 9.1 and Lemma 9.1, we obtain the following result, which 
is an important refinement of the basic sieve inequality. 


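Theorem 9.2 Let $\lambda_1^{\pm}(d)$ be upper and lower bound sieves with sieving range $\mathcal{P}_1$
and support level $D$. Let $|\lambda_1^{\pm}(d)| \le 1$ for all $d \ge 1$. Let $\mathcal{Q}$ be a finite set of primes
disjoint from $\mathcal{P}_1$, and let $Q$ be the product of the primes in $\mathcal{Q}$. Let $\mathcal{P} = \mathcal{P}_1 \cup \mathcal{Q}$.
For each $n \ge 1$, let $g_n(d)$ be a multiplicative function such that
\[
0 \le g_n(p) < 1 \quad \text{for all } p \in \mathcal{P}.
\]
Let
\[
G_n(z, \lambda_1^{\pm}) = \sum_{d \mid P_1(z)} \lambda_1^{\pm}(d) g_n(d).
\]
Then
\[
S(A, \mathcal{P}, z) \le \sum_{n=1}^{\infty} a(n) G_n(z, \lambda_1^+) \prod_{q \mid Q(z)} (1 - g_n(q)) + R(DQ, \mathcal{P}, z)
\]
and
\[
S(A, \mathcal{P}, z) \ge \sum_{n=1}^{\infty} a(n) G_n(z, \lambda_1^-) \prod_{q \mid Q(z)} (1 - g_n(q)) - R(DQ, \mathcal{P}, z),
\]
where
\[
R(DQ, \mathcal{P}, z) = \sum_{\substack{d \mid P(z) \\ d < DQ}} |r(d)|.
\]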
It often happens in applications that the arithmetic functions $g_n(d)$ satisfy one-sided
inequalities of the form
\[
\prod_{\substack{p \in \mathcal{P} \\ u \le p < z}} (1 - g_n(p))^{-1} \le K \left( \frac{\log z}{\log u} \right)^{\kappa},
\]
where $K > 1$ and $\kappa > 0$ are constants that are independent of $n$, and the inequality
holds for all $n$ and $1 < u < z$. In this case we say the sieve has dimension $\kappa$. The
case $\kappa = 1$ is called the linear sieve. The goal of this chapter is to obtain upper
and lower bounds for the linear sieve that were first proved by Jurkat and Richert
(Theorem 9.7). This is the only sieve inequality that is needed for Chen's theorem.


9.2 Construction of a combinatorial sieve 


In a combinatorial sieve, we reduce the size of the error term in Legendre's formula
by replacing the Möbius function with its truncation to a finite set of positive
integers. This idea goes back to Viggo Brun [7]. We construct these truncated
functions in the following theorem.


Theorem 9.3 Let $\beta > 1$ and $D > 0$ be real numbers. Let $\mathcal{D}^+$ be the set consisting
of 1 and all square-free numbers
\[
d = p_1 p_2 \cdots p_k
\]
such that
\[
p_k < \cdots < p_2 < p_1 < D
\]
and
\[
p_m < \left( \frac{D}{p_1 p_2 \cdots p_m} \right)^{1/\beta}
\]
for all odd integers $m$. Let $\mathcal{D}^-$ be the set consisting of 1 and all square-free numbers
\[
d = p_1 p_2 \cdots p_k
\]
such that
\[
p_k < \cdots < p_2 < p_1 < D
\]
and
\[
p_m < \left( \frac{D}{p_1 p_2 \cdots p_m} \right)^{1/\beta}
\]
for all even integers $m$. Then the sets $\mathcal{D}^+$ and $\mathcal{D}^-$ are finite sets of square-free
positive integers $d < D$. Let $\mathcal{P}$ be a set of primes, and let $P(D)$ denote the product
of all of the primes in $\mathcal{P}$ that are less than $D$. Define the arithmetic functions $\lambda^+(d)$
and $\lambda^-(d)$ as follows:
\[
\lambda^+(d) = \begin{cases} \mu(d) & \text{if } d \in \mathcal{D}^+ \text{ and } d \mid P(D) \\ 0 & \text{otherwise} \end{cases}
\]
and
\[
\lambda^-(d) = \begin{cases} \mu(d) & \text{if } d \in \mathcal{D}^- \text{ and } d \mid P(D) \\ 0 & \text{otherwise.} \end{cases}
\]
Then $\lambda^+(d)$ and $\lambda^-(d)$ are upper and lower bound sieves with sieving range $\mathcal{P}$ and
support level $D$.


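Proof. The condition
\[
p_m < \left( \frac{D}{p_1 p_2 \cdots p_m} \right)^{1/\beta}
\]
is equivalent to
\[
p_1 p_2 \cdots p_{m-1} p_m^{1+\beta} < D.
\]
Let $d = p_1 \cdots p_k \in \mathcal{D}^+$. If $k$ is odd, then
\[
d = p_1 \cdots p_{k-1} p_k < p_1 \cdots p_{k-1} p_k^{1+\beta} < D.
\]
If $k$ is even, then $k - 1$ is odd. Since $p_k < p_{k-1}$ and $\beta > 1$, we have
\[
d = p_1 \cdots p_{k-1} p_k < p_1 \cdots p_{k-2} p_{k-1}^2 < p_1 \cdots p_{k-2} p_{k-1}^{1+\beta} < D.
\]
Therefore, $1 \le d < D$ for all $d \in \mathcal{D}^+$.

Similarly, if $d = p_1 \cdots p_k \in \mathcal{D}^-$ and $k \ge 2$, then $1 \le d < D$. For $k = 1$, we
have $d = p_1 < D$, that is, $\mathcal{D}^-$ contains all primes strictly less than $D$. Therefore,
$1 \le d < D$ for all $d \in \mathcal{D}^-$.

The arithmetic functions $\lambda^+(d)$ and $\lambda^-(d)$ are truncations of the Möbius function
$\mu(d)$ to certain subsets of the sets $\mathcal{D}^+$ and $\mathcal{D}^-$, respectively. Since both sets contain
1, we have
\[
\lambda^+(1) = \lambda^-(1) = \mu(1) = 1.
\]
Let $m \ge 2$. We must prove that
\[
\sum_{d \mid m} \lambda^-(d) \le 0 \le \sum_{d \mid m} \lambda^+(d). \tag{9.8}
\]
Since the functions $\lambda^{\pm}(d)$ are supported on divisors of $P(D)$, we may assume that
$m$ divides $P(D)$. Let $\omega(m)$ denote the number of distinct prime divisors of $m$. The
proof is by induction on $k = \omega(m)$. If $k = 1$, then $m = p < D$ for some prime
$p \in \mathcal{P}$, and so $m \in \mathcal{D}^-$. We have
\[
\sum_{d \mid m} \lambda^-(d) = \mu(1) + \mu(p) = 0
\]
and
\[
\sum_{d \mid m} \lambda^+(d) = \mu(1) + \lambda^+(p) \ge 1 - 1 = 0.
\]
This proves (9.8) in the case $k = 1$.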
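Now let $k \ge 1$, and assume that inequalities (9.8) hold for all positive integers
$m$ with $k$ distinct prime divisors. If $\omega(m) = k + 1$, then we can write $m$ in the form
\[
m = q_0 q_1 \cdots q_k,
\]
where
\[
q_k < q_{k-1} < \cdots < q_1 < q_0 < D,
\]
$q_0, q_1, \ldots, q_k$ are prime numbers in $\mathcal{P}$, and $q_0$ is the greatest prime divisor of $m$.
Let
\[
m_1 = \frac{m}{q_0} = q_1 \cdots q_k.
\]
Since $m_1$ is a divisor of $P(D)$ with $k$ prime factors, it follows from the induction
hypothesis that
\[
\sum_{d \mid m_1} \lambda^-(d) \le 0 \le \sum_{d \mid m_1} \lambda^+(d).
\]
Every divisor of $m$ is of the form $d$ or $q_0 d$, where $d$ is a divisor of $m_1$. Therefore,
\[
\begin{aligned}
\sum_{d \mid m} \lambda^+(d) &= \sum_{d \mid m_1} \lambda^+(d) + \sum_{d \mid m_1} \lambda^+(q_0 d) \\
&\ge \sum_{d \mid m_1} \lambda^+(q_0 d) \\
&= \sum_{\substack{d \mid m_1 \\ q_0 d \in \mathcal{D}^+}} \mu(q_0 d) \\
&= - \sum_{\substack{d \mid m_1 \\ q_0 d \in \mathcal{D}^+}} \mu(d).
\end{aligned}
\]
Similarly,
\[
\sum_{d \mid m} \lambda^-(d) \le - \sum_{\substack{d \mid m_1 \\ q_0 d \in \mathcal{D}^-}} \mu(d).
\]
If $d$ is a divisor of $m_1$, then
\[
d = p_1 \cdots p_j,
\]
where $p_1, \ldots, p_j$ are primes in $\mathcal{P}$ such that
\[
p_j < \cdots < p_1 \le q_1 < q_0 < D.
\]
Let $D_1 = D/q_0 > 0$, and let $\mathcal{D}_1^+$ and $\mathcal{D}_1^-$ be the sets of integers constructed from $\beta$
and $D_1$. Let $\lambda_1^+(d)$ and $\lambda_1^-(d)$ be the Möbius function truncated to the sets $\mathcal{D}_1^+$ and
$\mathcal{D}_1^-$, respectively. Then $q_0 d \in \mathcal{D}^+$ if and only if
\[
q_0 < \left( \frac{D}{q_0} \right)^{1/\beta}
\]
and
\[
p_m < \left( \frac{D}{q_0 p_1 \cdots p_m} \right)^{1/\beta} = \left( \frac{D_1}{p_1 \cdots p_m} \right)^{1/\beta}
\]
for all even integers $m$. If
\[
q_0 \ge \left( \frac{D}{q_0} \right)^{1/\beta},
\]
then $q_0 d \notin \mathcal{D}^+$ and so
\[
\sum_{\substack{d \mid m_1 \\ q_0 d \in \mathcal{D}^+}} \mu(d) = 0,
\]
since the sum is empty. If
\[
q_0 < \left( \frac{D}{q_0} \right)^{1/\beta},
\]
then $q_0 d \in \mathcal{D}^+$ if and only if $d \in \mathcal{D}_1^-$, and
\[
\sum_{\substack{d \mid m_1 \\ q_0 d \in \mathcal{D}^+}} \mu(d) = \sum_{\substack{d \mid m_1 \\ d \in \mathcal{D}_1^-}} \mu(d) = \sum_{d \mid m_1} \lambda_1^-(d) \le 0
\]
by the induction hypothesis. Therefore,
\[
\sum_{d \mid m} \lambda^+(d) \ge 0.
\]
Similarly, $q_0 d \in \mathcal{D}^-$ if and only if $d \in \mathcal{D}_1^+$, and so
\[
\sum_{\substack{d \mid m_1 \\ q_0 d \in \mathcal{D}^-}} \mu(d) = \sum_{\substack{d \mid m_1 \\ d \in \mathcal{D}_1^+}} \mu(d) = \sum_{d \mid m_1} \lambda_1^+(d) \ge 0.
\]
This proves that $\lambda^+(d)$ and $\lambda^-(d)$ are upper and lower bound sieves with sieving
range $\mathcal{P}$ and support level $D$.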
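
The sets $\mathcal{D}^{\pm}$ and inequalities (9.8) can be explored by direct enumeration. The following is a minimal sketch, assuming an integer $\beta$ (so that the condition $p_1 \cdots p_{m-1} p_m^{1+\beta} < D$ can be tested in exact integer arithmetic) and taking $\mathcal{P}$ to be all primes; the function names are illustrative. Following the reduction made in the proof, the check is run only over integers $m \ge 2$ that divide $P(D)$.

```python
from itertools import combinations


def in_D(primes_desc, D, beta, parity):
    """True if d = p_1 ... p_k (given with p_1 > ... > p_k) lies in D^+
    (parity 1: conditions at odd m) or D^- (parity 0: conditions at even m)."""
    if primes_desc and primes_desc[0] >= D:
        return False
    prod = 1
    for m, p in enumerate(primes_desc, start=1):
        # condition p_m < (D / (p_1 ... p_m))^(1/beta), in integer form
        if m % 2 == parity and prod * p ** (1 + beta) >= D:
            return False
        prod *= p
    return True


def divisor_sum(m_primes, D, beta, parity):
    """Sum of lambda^+ (parity 1) or lambda^- (parity 0) over all divisors
    of the square-free integer m with the given prime factors."""
    ps = sorted(m_primes, reverse=True)
    total = 0
    for k in range(len(ps) + 1):
        for sub in combinations(ps, k):  # keeps the descending order
            if in_D(sub, D, beta, parity):
                total += (-1) ** k       # mu(d) for square-free d
    return total


# m = 3 * 5 with D = 30, beta = 2
print(divisor_sum((3, 5), 30, 2, 1), divisor_sum((3, 5), 30, 2, 0))
```

For $m = 15$, $D = 30$, $\beta = 2$, the only nonzero upper weights are $\lambda^+(1) = 1$ and $\lambda^+(3) = -1$, while the lower weights are $\lambda^-(1) = 1$, $\lambda^-(3) = \lambda^-(5) = -1$, and $\lambda^-(15) = 0$.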


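Lemma 9.2 Let $\mathcal{P}$ be a set of primes, and let $g(d)$ be a multiplicative function
such that
\[
0 \le g(p) < 1 \quad \text{for all } p \in \mathcal{P}.
\]
Let
\[
V(z) = \prod_{\substack{p \in \mathcal{P} \\ p < z}} (1 - g(p)) = \sum_{d \mid P(z)} \mu(d) g(d).
\]
Then $V(z)$ is a decreasing function of $z$,
\[
0 < V(z) \le 1
\]
for all $z$, and
\[
\sum_{\substack{p \in \mathcal{P} \\ w \le p < z}} g(p) V(p) = V(w) - V(z) \tag{9.9}
\]
for all $1 \le w < z$.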
Proof. It follows immediately from the definition that $V(z)$ is decreasing and
$V(z) \in (0, 1]$ for all $z$.

The proof of the combinatorial identity (9.9) is by induction on the number $k$ of
primes $p \in \mathcal{P}$ that lie in the interval $[w, z)$. If $k = 0$, then $V(w) = V(z)$ and
\[
\sum_{\substack{p \in \mathcal{P} \\ w \le p < z}} g(p) V(p) = 0.
\]
If $k \ge 1$, let $p_1$ be the largest prime in the interval. Then
\[
\begin{aligned}
\sum_{\substack{p \in \mathcal{P} \\ w \le p < z}} g(p) V(p) &= \sum_{\substack{p \in \mathcal{P} \\ w \le p < p_1}} g(p) V(p) + g(p_1) V(p_1) \\
&= V(w) - V(p_1) + g(p_1) V(p_1) \\
&= V(w) - (1 - g(p_1)) V(p_1) \\
&= V(w) - V(z).
\end{aligned}
\]

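Lemma 9.3 Let $\mathcal{P}$ be a set of primes. For $\beta > 1$ and $2 \le z \le D$, let
\[
y_m = y_m(\beta, D, p_1, \ldots, p_m) = \left( \frac{D}{p_1 \cdots p_m} \right)^{1/\beta}.
\]
Let $\lambda^{\pm}(d)$ be the upper and lower bound sieves constructed in Theorem 9.3, and
let
\[
G(z, \lambda^{\pm}) = \sum_{d \mid P(z)} \lambda^{\pm}(d) g(d).
\]
Let
\[
T_n(D, z) = \sum_{\substack{y_n \le p_n < \cdots < p_1 < z,\ p_i \in \mathcal{P} \\ p_m < y_m\ \forall m < n,\ m \equiv n \pmod 2}} g(p_1 \cdots p_n) V(p_n).
\]
Then
\[
G(z, \lambda^+) = V(z) + \sum_{\substack{n=1 \\ n \equiv 1 \pmod 2}}^{\infty} T_n(D, z) \tag{9.10}
\]
and
\[
G(z, \lambda^-) = V(z) - \sum_{\substack{n=1 \\ n \equiv 0 \pmod 2}}^{\infty} T_n(D, z). \tag{9.11}
\]
Moreover,
\[
T_n(D, z) \ge 0
\]
for all $n \ge 1$, and
\[
G(z, \lambda^-) \le V(z) \le G(z, \lambda^+).
\]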


If $D = z^s$, then $T_n(D, z) = 0$ unless $s < n + \beta$.
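Proof. It follows from the construction of the sets $\mathcal{D}^{\pm}$ and the sieves $\lambda^{\pm}(d)$ that
\[
G(z, \lambda^+) = \sum_{\substack{d \mid P(z) \\ d \in \mathcal{D}^+}} \mu(d) g(d)
= \sum_{\substack{p_k < \cdots < p_1 < z,\ p_i \in \mathcal{P} \\ p_m < y_m\ \forall m \equiv 1 \pmod 2}} (-1)^k g(p_1 \cdots p_k)
\]
and
\[
G(z, \lambda^-) = \sum_{\substack{d \mid P(z) \\ d \in \mathcal{D}^-}} \mu(d) g(d)
= \sum_{\substack{p_k < \cdots < p_1 < z,\ p_i \in \mathcal{P} \\ p_m < y_m\ \forall m \equiv 0 \pmod 2}} (-1)^k g(p_1 \cdots p_k).
\]
We expand the function $V(z)$ to obtain a partition of $G(z, \lambda^+)$ as a sum of nonnegative
functions:
\[
\begin{aligned}
V(z) &= \sum_{d \mid P(z)} \mu(d) g(d) \\
&= \sum_{\substack{p_k < \cdots < p_1 < z \\ p_i \in \mathcal{P}}} (-1)^k g(p_1 \cdots p_k) \\
&= \sum_{\substack{p_k < \cdots < p_1 < z,\ p_i \in \mathcal{P} \\ p_m < y_m\ \forall m \equiv 1 \pmod 2}} (-1)^k g(p_1 \cdots p_k)
+ \sum_{\substack{p_k < \cdots < p_1 < z,\ p_i \in \mathcal{P} \\ \exists m \equiv 1 \pmod 2:\ y_m \le p_m}} (-1)^k g(p_1 \cdots p_k) \\
&= G(z, \lambda^+) + \sum_{\substack{n=1 \\ n \equiv 1 \pmod 2}}^{\infty} \sum_{\substack{p_k < \cdots < p_1 < z,\ p_i \in \mathcal{P} \\ p_m < y_m\ \forall m < n,\ m \equiv 1 \pmod 2 \\ y_n \le p_n}} (-1)^k g(p_1 \cdots p_k) \\
&= G(z, \lambda^+) + \sum_{\substack{n=1 \\ n \equiv 1 \pmod 2}}^{\infty} \sum_{\substack{y_n \le p_n < \cdots < p_1 < z,\ p_i \in \mathcal{P} \\ p_m < y_m\ \forall m < n,\ m \equiv 1 \pmod 2}} (-1)^n g(p_1 \cdots p_n) \sum_{\substack{p_k < \cdots < p_{n+1} < p_n \\ p_i \in \mathcal{P}}} (-1)^{k-n} g(p_{n+1} \cdots p_k) \\
&= G(z, \lambda^+) - \sum_{\substack{n=1 \\ n \equiv 1 \pmod 2}}^{\infty} \sum_{\substack{y_n \le p_n < \cdots < p_1 < z,\ p_i \in \mathcal{P} \\ p_m < y_m\ \forall m < n,\ m \equiv 1 \pmod 2}} g(p_1 \cdots p_n) V(p_n) \\
&= G(z, \lambda^+) - \sum_{\substack{n=1 \\ n \equiv 1 \pmod 2}}^{\infty} T_n(D, z),
\end{aligned}
\]
where
\[
T_n(D, z) = \sum_{\substack{y_n \le p_n < \cdots < p_1 < z,\ p_i \in \mathcal{P} \\ p_m < y_m\ \forall m < n,\ m \equiv n \pmod 2}} g(p_1 \cdots p_n) V(p_n) \ge 0.
\]
Therefore,
\[
G(z, \lambda^+) = V(z) + \sum_{\substack{n=1 \\ n \equiv 1 \pmod 2}}^{\infty} T_n(D, z) \ge V(z).
\]
Similarly,
\[
G(z, \lambda^-) = V(z) - \sum_{\substack{n=1 \\ n \equiv 0 \pmod 2}}^{\infty} T_n(D, z) \le V(z).
\]
If
\[
y_n \le p_n < \cdots < p_1 < z, \tag{9.12}
\]
then
\[
D \le p_1 \cdots p_{n-1} p_n^{1+\beta} < z^{n+\beta}.
\]
Let $D = z^s$. Since $T_n(D, z)$ is a sum over integers $p_1 \cdots p_n$ that satisfy inequality
(9.12), it follows that $T_n(D, z) = 0$ unless $s < n + \beta$. This completes the
proof.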


9.3 Approximations

For the rest of this chapter, we shall consider only the case
\[
\beta = 2
\]
in the construction of the sets $\mathcal{D}^{\pm}$ and the upper and lower bound sieves $\lambda^{\pm}(d)$.
Then
\[
y_m = \left( \frac{D}{p_1 \cdots p_m} \right)^{1/2}
\]
and the functions $T_n(D, z)$ satisfy the following recursion relation.


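Lemma 9.4 Let $z \ge 2$ and $D$ be real numbers such that
\[
s = \frac{\log D}{\log z} \ge \begin{cases} 1 & \text{if } n \text{ is odd,} \\ 2 & \text{if } n \text{ is even.} \end{cases}
\]
Then
\[
T_1(D, z) = V(D^{1/3}) - V(z). \tag{9.13}
\]
Let $n \ge 2$. If $n$ is even, or if $n$ is odd and $s \ge 3$, then
\[
T_n(D, z) = \sum_{\substack{p \in \mathcal{P} \\ p < z}} g(p) T_{n-1}\left( \frac{D}{p}, p \right). \tag{9.14}
\]
If $n$ is odd and $1 \le s \le 3$, then
\[
T_n(D, z) = \sum_{\substack{p \in \mathcal{P} \\ p < D^{1/3}}} g(p) T_{n-1}\left( \frac{D}{p}, p \right). \tag{9.15}
\]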

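Proof. Since $y_1 = (D/p_1)^{1/2}$, it follows from Lemma 9.2 that
\[
T_1(D, z) = \sum_{\substack{p_1 \in \mathcal{P} \\ y_1 \le p_1 < z}} g(p_1) V(p_1)
= \sum_{\substack{p_1 \in \mathcal{P} \\ D^{1/3} \le p_1 < z}} g(p_1) V(p_1)
= V(D^{1/3}) - V(z).
\]
If $n$ is even, then
\[
\begin{aligned}
T_n(D, z) &= \sum_{\substack{p_n < \cdots < p_1 < z,\ p_i \in \mathcal{P} \\ p_1 \cdots p_{n-1} p_n^3 \ge D \\ p_1 \cdots p_{m-1} p_m^3 < D\ \forall m < n,\ m \equiv n \pmod 2}} g(p_1 \cdots p_n) V(p_n) \\
&= \sum_{\substack{p_1 \in \mathcal{P} \\ p_1 < z}} g(p_1) \sum_{\substack{p_n < \cdots < p_2 < p_1,\ p_i \in \mathcal{P} \\ p_2 \cdots p_{n-1} p_n^3 \ge D/p_1 \\ p_2 \cdots p_{m-1} p_m^3 < D/p_1\ \forall m < n,\ m \equiv n \pmod 2}} g(p_2 \cdots p_n) V(p_n) \\
&= \sum_{\substack{p_1 \in \mathcal{P} \\ p_1 < z}} g(p_1) T_{n-1}\left( \frac{D}{p_1}, p_1 \right).
\end{aligned}
\]
Let $n$ be odd, $n \ge 3$. If $p_1 < y_1 = (D/p_1)^{1/2}$ and $p_1 < z = D^{1/s}$, then
\[
p_1 < \min\left( D^{1/3}, D^{1/s} \right) = \begin{cases} D^{1/s} & \text{if } s \ge 3 \\ D^{1/3} & \text{if } 1 \le s \le 3, \end{cases}
\]
and the argument proceeds exactly as in the case of even integers $n$. This completes
the proof.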

We shall now construct a sequence of continuous functions $f_n(s)$ that will be
used later to approximate the discrete functions $T_n(D, z)$. For $s \ge 1$, let $R_n(s)$ be
the open convex region of Euclidean space consisting of all points $(t_1, \ldots, t_n) \in \mathbf{R}^n$
such that
\[
0 < t_n < \cdots < t_1 < \frac{1}{s},
\]
\[
t_1 + \cdots + t_n + 2t_n > 1,
\]
and
\[
t_1 + \cdots + t_m + 2t_m < 1 \quad \text{if } m < n \text{ and } m \equiv n \pmod 2. \tag{9.16}
\]
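For $n \ge 1$ and $s \ge 1$, we define the function $f_n(s)$ by the multiple integral
\[
s f_n(s) = \int \cdots \int_{R_n(s)} \frac{dt_1 \cdots dt_n}{(t_1 \cdots t_n) t_n}. \tag{9.17}
\]
The function $f_n(s)$ is nonnegative, continuous, and decreasing, since $R_n(s_2) \subseteq
R_n(s_1)$ for $s_1 < s_2$. If $f_n(s) > 0$, then $R_n(s)$ is nonempty, so $R_n(s)$ contains a
point $(t_1, \ldots, t_n)$. This point satisfies
\[
1 < t_1 + \cdots + t_n + 2t_n < (n+2) t_1 < \frac{n+2}{s},
\]
and so
\[
\frac{1}{n+2} < t_1 < \frac{1}{s}. \tag{9.18}
\]
It follows that
\[
f_n(s) = 0 \quad \text{for } s \ge n + 2. \tag{9.19}
\]
It is easy to compute $f_1(s)$ and $f_2(s)$. We have $f_1(s) = 0$ for $s \ge 3$. For $1 \le s \le 3$,
we have
\[
R_1(s) = (1/3, 1/s)
\]
and so
\[
s f_1(s) = \int_{1/3}^{1/s} \frac{dt_1}{t_1^2} = 3 - s. \tag{9.20}
\]
Similarly, $f_2(s) = 0$ for $s \ge 4$. For $2 \le s \le 4$, we have
\[
R_2(s) = \left\{ (t_1, t_2) : \frac{1}{4} < t_1 < \frac{1}{s} \text{ and } \frac{1 - t_1}{3} < t_2 < t_1 \right\}
\]
and so
\[
\begin{aligned}
s f_2(s) &= \int_{1/4}^{1/s} \int_{(1-t_1)/3}^{t_1} \frac{dt_2}{t_2^2} \frac{dt_1}{t_1} \\
&= \int_{1/4}^{1/s} \left( \frac{3}{1 - t_1} - \frac{1}{t_1} \right) \frac{dt_1}{t_1} \\
&= \int_{1/4}^{1/s} \left( \frac{3}{1 - t_1} + \frac{3}{t_1} - \frac{1}{t_1^2} \right) dt_1 \\
&= s - 3\log(s - 1) + 3\log 3 - 4.
\end{aligned}
\]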

The functions $f_n(s)$ satisfy the following recursion relation.


Lemma 9.5 Let $n \ge 2$. If $n$ is even and $s \ge 2$, or if $n$ is odd and $s \ge 3$, then
\[
s f_n(s) = \int_s^{\infty} f_{n-1}(t - 1)\,dt. \tag{9.21}
\]
If $n$ is odd and $1 \le s \le 3$, then
\[
s f_n(s) = 3 f_n(3) = \int_3^{\infty} f_{n-1}(t - 1)\,dt. \tag{9.22}
\]
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Proof. If n is even and s > 2, or if n is odd and s > 3, then, from (9.18), we 


have 
dt, ---dt, 
S (s) = / af OE 
In R,(s) (ti +++ tn )tn 


[" | dtp--+dt, | dt, 
1/(n+2) pee, (lar tatn Jot 


ty tettm 4+2tm <1-t 
Vi<m<n,m=n (mod 2) 


In the inner integral, we make the change of variables 


t= (1 — t))uj-1 


fori =2,...,n. Let 


Since t; < 1/s, it follows that s; > lifnmisevenands > 2,ands; > 2 ifn is odd 
and s > 3. We obtain 


{ dty +++ dt, 
ptm ten (t2 sly )tn 


ty te-+tm +2tm <1—ty 
Vl<m<n,m=n (mod 2) 


| d du, 1 
O<u ~p<<uy <ty /(Ul-4) _ tee 
Wy ntl 142-1 >I (1 ty )(uy Un—1)Un—-1 
Uy tet +2] <1 
Vl<m<n,m—l=n—1 (mod 2) 


1 | du,-:-duy_| 
O<uy | <7 <uy <l 
1-t | (Uy Un )Un-1 


Uptertuy —y +2u, 1 >1 
Up tet — 1 +21 <1 
Vl<m<n,m—-1l=n—1! (mod 2) 


1 i du tee du,—1 
1 — ty Rn-1(S1) (uy vee Un—1)Un—1 


| 
|e 
Sh | 
| = 
—_ 
——~ 
a | 
| 

b— 
a” 


Setting t = 1/t,;, we obtain 


l/s l at) 
Sfr(S) = | * h- (- _ 7 ~~ 
I/(n2) ty ty ty 


= fn—ilt — 1)dt 
2 


n+ 


n+2 
-/ fn—1(t — 1)dt 


= [ Fn-1(t — 1)dt, 
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since f,-,(¢ — 1) =O fort —1 > (n — 1) +2 by (9.19). 

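Let $n \ge 3$ be an odd integer. If $(t_1, \ldots, t_n) \in R_n(s)$, then $t_1 < 1/s$. Also, it follows from inequality (9.16) with $m = 1$ that $t_1 < 1/3$, and so $t_1 < 1/\max(s, 3)$. Therefore, if $1 \le s \le 3$, then
\[ R_n(s) = R_n(3) \]
and
\begin{align*}
s f_n(s) &= \int\cdots\int_{R_n(s)} (t_1 \cdots t_n)^{-1} t_n^{-1}\, dt_1 \cdots dt_n \\
&= \int\cdots\int_{R_n(3)} (t_1 \cdots t_n)^{-1} t_n^{-1}\, dt_1 \cdots dt_n \\
&= 3 f_n(3).
\end{align*}
This completes the proof.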
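
The recursion (9.21) can be checked numerically in the smallest case $n = 2$. The following is a minimal sketch, not part of the original argument: it compares the closed form for $s f_2(s)$ on $[2, 4]$ with a midpoint-rule approximation of $\int_s^\infty f_1(t-1)\,dt$. The function names and the step count are illustrative choices.

```python
import math

def f1(s):
    """f_1(s) from (9.20): s*f_1(s) = 3 - s on [1, 3], and f_1(s) = 0 for s >= 3."""
    return (3.0 - s) / s if 1.0 <= s <= 3.0 else 0.0

def f2_closed(s):
    """Closed form for f_2 on [2, 4]; f_2(s) = 0 for s >= 4."""
    if s >= 4.0:
        return 0.0
    return (s - 3.0 * math.log(s - 1.0) + 3.0 * math.log(3.0) - 4.0) / s

def f2_recursion(s, steps=20000):
    """f_2(s) from (9.21): s*f_2(s) = integral_s^4 f_1(t - 1) dt (midpoint rule).

    The upper limit is 4 because f_1(t - 1) = 0 for t >= 4.
    """
    if s >= 4.0:
        return 0.0
    h = (4.0 - s) / steps
    total = sum(f1(s + (i + 0.5) * h - 1.0) for i in range(steps))
    return total * h / s

for s in (2.0, 2.5, 3.0, 3.5):
    print(s, f2_closed(s), f2_recursion(s))
```

The two columns should agree to many decimal places, which is a quick sanity check on the constants $3\log 3 - 4$ in the closed form.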
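We construct the function $h(s)$ for $s \ge 1$ as follows:
\[ h(s) = \begin{cases} e^{-2} & \text{for } 1 \le s \le 2 \\ e^{-s} & \text{for } 2 \le s \le 3 \\ 3 s^{-1} e^{-s} & \text{for } s \ge 3. \end{cases} \tag{9.23} \]
It is easy to check (Exercise 8) that
\[ h(s - 1) \le 4 h(s) \quad \text{for } s \ge 2. \]
For $s \ge 2$, let
\[ H(s) = \int_s^\infty h(t - 1)\,dt. \]
Both $h(s)$ and $H(s)$ are continuous, positive, and decreasing functions on their domains. Let
\[ \alpha = \frac{H(2)}{2h(2)} = \frac{e^2 H(2)}{2} = 1 - \frac{1}{2e} + \frac{3e^2}{2} \int_3^\infty e^{-t} t^{-1}\,dt. \]
We can express $\alpha$ in terms of the exponential integral
\[ \operatorname{Ei}(x) = \int_{-\infty}^x e^{t} t^{-1}\,dt \]
since
\[ \int_3^\infty e^{-t} t^{-1}\,dt = -\operatorname{Ei}(-3) = 0.013048\ldots. \]
We can obtain this number with technology, such as Maple, or without technology, either by estimating the integral directly or by looking it up in old books, such as Dwight's Mathematical Tables [26, page 107]. We find that
\[ \alpha = 0.96068\ldots. \tag{9.24} \]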
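
As an illustration of the "with technology" route, here is a small sketch that evaluates $\alpha$ in plain Python. It assumes the standard power series $E_1(x) = -\gamma - \log x + \sum_{k \ge 1} (-1)^{k+1} x^k/(k \cdot k!)$ for $E_1(x) = -\operatorname{Ei}(-x)$; the helper names are hypothetical, and the grid check of $h(s-1) \le 4h(s)$ is only a spot check, not a proof of Exercise 8.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded

def exp_integral_e1(x, terms=60):
    """E_1(x) = integral_x^inf e^{-t}/t dt = -Ei(-x), via its power series."""
    total, power_over_fact = 0.0, 1.0
    for k in range(1, terms + 1):
        power_over_fact *= x / k              # now equals x^k / k!
        total += (-1) ** (k + 1) * power_over_fact / k
    return -EULER_GAMMA - math.log(x) + total

def h(s):
    """The function h(s) of (9.23), for s >= 1."""
    if s <= 2.0:
        return math.exp(-2.0)
    if s <= 3.0:
        return math.exp(-s)
    return 3.0 * math.exp(-s) / s

e1_3 = exp_integral_e1(3.0)
# H(2) = int_1^2 e^{-2} du + int_2^3 e^{-u} du + 3 * int_3^inf e^{-u}/u du
H2 = math.exp(-2.0) + (math.exp(-2.0) - math.exp(-3.0)) + 3.0 * e1_3
alpha = H2 / (2.0 * h(2.0))

# Spot check of h(s - 1) <= 4 h(s) on a grid in [2, 12].
grid_ok = all(h(2.0 + i / 100.0 - 1.0) <= 4.0 * h(2.0 + i / 100.0)
              for i in range(1001))

print(e1_3, alpha, grid_ok)
```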
Lemma 9.6
\[ H(s) \le \alpha s h(s) \quad \text{for } s \ge 2 \tag{9.25} \]
and
\[ H(3) \le \alpha s h(s) \quad \text{for } 1 \le s \le 3. \tag{9.26} \]

Proof. If $s \ge 3$, then $h(s - 1) \le e^{1-s}$ and
\[ H(s) \le \int_s^\infty e^{1-t}\,dt = e^{1-s} = \frac{e\, s h(s)}{3} < \alpha s h(s). \]
For $2 \le s \le 3$, let
\[ H_0(s) = \alpha s h(s) - H(s). \]
We have
\[ s - 1 = 1 + (s - 2) \le e^{s-2}, \]
and so
\[ (1 - s) e^{-s} \ge -e^{-2}. \]
Then
\begin{align*}
H_0'(s) &= \alpha h(s) + \alpha s h'(s) - H'(s) \\
&= \alpha (1 - s) e^{-s} + h(s - 1) \\
&\ge (1 - \alpha) e^{-2} \\
&> 0,
\end{align*}
and so $H_0(s)$ is increasing for $2 \le s \le 3$. Since
\[ H_0(2) = 0 \]
by the definition of $\alpha$, it follows that
\[ H(3) \le H(s) \le \alpha s h(s) \quad \text{for } 2 \le s \le 3. \]
Let $1 \le s \le 2$. Since $\alpha < 1$, it follows that $h(2) > H(2)/2$ and
\[ H(3) = H(2) - e^{-2} = H(2) - h(2) < \frac{H(2)}{2} = \alpha h(2) \le \alpha s h(2) = \alpha s h(s). \]
This completes the proof.

Lemma 9.7 If $n$ is odd and $s \ge 1$, or if $n$ is even and $s \ge 2$, then
\[ f_n(s) \le 2e^2 \alpha^{n-1} h(s). \]

Proof. This is by induction on $n$. For $n = 1$, we shall show that
\[ s f_1(s) \le 2e^2 s h(s). \]
For $1 \le s \le 3$, we have $s f_1(s) = 3 - s$ by (9.20). If $1 \le s \le 2$, then $h(s) = e^{-2}$ and
\[ s f_1(s) = 3 - s \le 2 = 2e^2 h(s) \le 2e^2 s h(s). \]
If $2 < s \le 3$, then $h(s) = e^{-s}$ and
\[ s f_1(s) = 3 - s < 1 < 4e^2 h(s) \le 2e^2 s h(s). \]
If $s > 3$, then $f_1(s) = 0$ and
\[ s f_1(s) = 0 < 2e^2 s h(s). \]
This proves the case $n = 1$.

Now let $n \ge 2$, and assume that the lemma holds for $n - 1$. By (9.21) and (9.25), if $n$ is even and $s \ge 2$, or if $n$ is odd and $s \ge 3$, then
\begin{align*}
s f_n(s) &= \int_s^\infty f_{n-1}(t - 1)\,dt \\
&\le 2e^2 \alpha^{n-2} \int_s^\infty h(t - 1)\,dt \\
&= 2e^2 \alpha^{n-2} H(s) \\
&\le 2e^2 \alpha^{n-2}\, \alpha s h(s) \\
&= 2e^2 \alpha^{n-1} s h(s).
\end{align*}
By (9.22) and (9.26), if $n$ is odd and $1 \le s \le 3$, then
\begin{align*}
s f_n(s) &= \int_3^\infty f_{n-1}(t - 1)\,dt \\
&\le 2e^2 \alpha^{n-2} \int_3^\infty h(t - 1)\,dt \\
&= 2e^2 \alpha^{n-2} H(3) \\
&\le 2e^2 \alpha^{n-2}\, \alpha s h(s) \\
&= 2e^2 \alpha^{n-1} s h(s).
\end{align*}
This completes the proof.
Theorem 9.4 For $s \ge 1$, the function
\[ F(s) = 1 + \sum_{\substack{n = 1 \\ n \equiv 1 \pmod 2}}^\infty f_n(s) \tag{9.27} \]
is continuous and differentiable, and
\[ F(s) = 1 + O(e^{-s}). \]
For $s \ge 2$, the function
\[ f(s) = 1 - \sum_{\substack{n = 2 \\ n \equiv 0 \pmod 2}}^\infty f_n(s) \tag{9.28} \]
is continuous and differentiable, and
\[ f(s) = 1 + O(e^{-s}). \]

Proof. By Lemma 9.7,
\[ 0 \le f_n(s) \le 2e^2 \alpha^{n-1} h(s) \le 2e^2 \alpha^{n-1} e^{-s} \]
for $s \ge (3 + (-1)^n)/2$. Therefore,
\[ \sum_{n=1}^\infty f_n(s) \ll e^{-s}. \]
The theorem follows immediately from this inequality.


9.4 The Jurkat—Richert theorem 


From now on, we shall consider only arithmetic functions g(d) that satisfy the 
linear sieve inequality (9.29). 


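Lemma 9.8 Let $z \ge 2$ and $1 < w < z$. Let $\mathcal{P}$ be a set of primes, and let $g(d)$ be a multiplicative function such that
\[ 0 \le g(p) < 1 \quad \text{for all } p \in \mathcal{P} \]
and
\[ \prod_{\substack{p \in \mathcal{P} \\ u \le p < z}} (1 - g(p))^{-1} \le K\, \frac{\log z}{\log u} \tag{9.29} \]
for some $K > 1$ and all $u$ such that $1 < u < z$. Let
\[ V(z) = \prod_{\substack{p \in \mathcal{P} \\ p < z}} (1 - g(p)), \]
and let $\Phi$ be a continuous, increasing function on the interval $[w, z]$. Then
\[ \sum_{\substack{p \in \mathcal{P} \\ w \le p < z}} g(p) V(p) \Phi(p) \le (K - 1) V(z) \Phi(z) - K V(z) \int_w^z \Phi(u)\, d\!\left( \frac{\log z}{\log u} \right). \]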
Proof. The step function
\[ S(u) = \sum_{\substack{p \in \mathcal{P} \\ u \le p < z}} g(p) V(p) \]
is nonnegative and decreasing. By Lemma 9.2 and inequality (9.29),
\begin{align*}
S(u) &= V(u) - V(z) \\
&= \left( \frac{V(u)}{V(z)} - 1 \right) V(z) \\
&= \Biggl( \prod_{\substack{p \in \mathcal{P} \\ u \le p < z}} (1 - g(p))^{-1} - 1 \Biggr) V(z) \\
&\le \left( K\, \frac{\log z}{\log u} - 1 \right) V(z).
\end{align*}
Let
\[ w \le p_k < p_{k-1} < \cdots < p_1 < z \]
be all the primes in $\mathcal{P}$ that lie in the interval $[w, z)$. Then $S(p_k) = S(w)$, $S(p_1) = g(p_1) V(p_1)$, and $S(u) = 0$ for $p_1 < u \le z$. By partial summation and integration by parts of the Riemann-Stieltjes integral,
\begin{align*}
\sum_{\substack{p \in \mathcal{P} \\ w \le p < z}} g(p) V(p) \Phi(p) &= \sum_{i=1}^k g(p_i) V(p_i) \Phi(p_i) \\
&= \sum_{i=2}^k (S(p_i) - S(p_{i-1})) \Phi(p_i) + S(p_1) \Phi(p_1) \\
&= \sum_{i=1}^k S(p_i) \Phi(p_i) - \sum_{i=1}^{k-1} S(p_i) \Phi(p_{i+1}) \\
&= S(p_k) \Phi(p_k) + \sum_{i=1}^{k-1} S(p_i) (\Phi(p_i) - \Phi(p_{i+1})) \\
&= S(w) \Phi(w) + S(p_k) (\Phi(p_k) - \Phi(w)) + \sum_{i=1}^{k-1} S(p_i) (\Phi(p_i) - \Phi(p_{i+1})) \\
&= S(w) \Phi(w) + \int_w^z S(u)\, d\Phi(u) \\
&= S(z) \Phi(z) - \int_w^z \Phi(u)\, dS(u) \\
&\le (K - 1) V(z) \Phi(z) - K V(z) \int_w^z \Phi(u)\, d\!\left( \frac{\log z}{\log u} \right).
\end{align*}
This completes the proof.
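
The identity $S(u) = V(u) - V(z)$ used at the start of this proof is a telescoping sum, and it can be confirmed exactly on a small example. The sketch below is only an illustration under stated assumptions: it takes $\mathcal{P}$ to be all primes and $g(p) = 1/p$ (a choice made here for the example, not one fixed by the text) and uses exact rational arithmetic.

```python
from fractions import Fraction

def primes_below(z):
    """All primes p < z, by a simple sieve of Eratosthenes."""
    sieve = [True] * max(z, 2)
    sieve[0:2] = [False, False]
    for i in range(2, int(z ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, z, i):
                sieve[j] = False
    return [p for p in range(z) if sieve[p]]

def g(p):
    """Illustrative sieve density g(p) = 1/p (an assumption for this example)."""
    return Fraction(1, p)

def V(u, primes):
    """V(u) = product of (1 - g(p)) over primes p < u."""
    prod = Fraction(1)
    for p in primes:
        if p < u:
            prod *= 1 - g(p)
    return prod

def S(u, z, primes):
    """S(u) = sum of g(p) V(p) over primes u <= p < z."""
    return sum((g(p) * V(p, primes) for p in primes if u <= p < z), Fraction(0))

z = 60
primes = primes_below(z)
identity_holds = all(S(u, z, primes) == V(u, primes) - V(z, primes)
                     for u in range(2, z + 1))
print(identity_holds)
```

Because the arithmetic is exact, the comparison is an equality test rather than a tolerance check.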


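Theorem 9.5 Let $z \ge 2$, and let $D$ be a real number such that $D \ge z$ for $n$ odd and $D \ge z^2$ for $n$ even, that is,
\[ s = \frac{\log D}{\log z} \ge \begin{cases} 1 & \text{if } n \text{ is odd} \\ 2 & \text{if } n \text{ is even.} \end{cases} \]
Let $\mathcal{P}$ be a set of primes, and let $g(d)$ be a multiplicative function such that
\[ 0 \le g(p) < 1 \quad \text{for all } p \in \mathcal{P} \]
and
\[ \prod_{\substack{p \in \mathcal{P} \\ u \le p < z}} (1 - g(p))^{-1} \le K\, \frac{\log z}{\log u} \]
for all $u$ such that $1 < u < z$, where the constant $K$ satisfies
\[ 1 < K < 1 + \frac{1}{200}. \]
Then
\[ T_n(D, z) < V(z) \left( f_n(s) + (K - 1) \left( \frac{99}{100} \right)^n e^{10 - s} \right). \tag{9.30} \]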


Proof. We define the number
\[ \tau = \alpha + 5(K - 1) + 11 e^{-8} \]
and the functions
\[ h_n(s) = (K - 1) \tau^n e^{10} h(s) \tag{9.31} \]
for $n \ge 1$. Note that
\[ \alpha < \tau < 0.9607 + 0.0250 + 0.0037 = 0.9894 < \frac{99}{100}. \]
We shall prove that
\[ T_n(D, z) < V(z) \left( f_n(s) + h_n(s) \right). \tag{9.32} \]
This immediately implies (9.30) since $h(s) \le e^{-s}$ for all $s \ge 1$.

The proof of (9.32) is by induction on $n$. Let $n = 1$. By Lemma 9.3 with $\beta = 2$, we have $T_1(D, z) = 0$ for $s > 3$. Since the right side of inequality (9.32) is positive, it follows that the inequality holds for $s > 3$. If $1 \le s \le 3$, then $f_1(s) = (3/s) - 1$ and
\[ T_1(D, z) = V(D^{1/3}) - V(z) \]
by (9.13). It follows that
\begin{align*}
\frac{T_1(D, z)}{V(z)} &= \frac{V(D^{1/3})}{V(z)} - 1 \\
&= \prod_{\substack{p \in \mathcal{P} \\ D^{1/3} \le p < z}} (1 - g(p))^{-1} - 1 \\
&\le \frac{3K \log z}{\log D} - 1 \\
&= \frac{3K}{s} - 1 \\
&= \left( \frac{3}{s} - 1 \right) + \frac{3(K - 1)}{s} \\
&\le f_1(s) + 3(K - 1) \\
&< f_1(s) + h_1(s)
\end{align*}
since $h(s) \ge e^{-3}$ and $\tau > 11 e^{-8}$, hence
\[ h_1(s) = (K - 1) \tau e^{10} h(s) > (K - 1)\, 11 e^{-1} > 3(K - 1). \]
This proves inequality (9.32) for $n = 1$.
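Let $n \ge 2$, and assume that inequality (9.32) holds for $n - 1$. For $n$ even and $s \ge 2$, or for $n$ odd and $s \ge 3$, we define the function
\[ \Phi(u) = f_{n-1}\!\left( \frac{\log D}{\log u} - 1 \right) + h_{n-1}\!\left( \frac{\log D}{\log u} - 1 \right) \]
for $1 < u \le z$. The function $\Phi(u)$ is continuous, positive, and increasing. Moreover,
\[ \Phi(z) = f_{n-1}(s - 1) + h_{n-1}(s - 1). \]
It follows from the recursion formula (9.14), the induction hypothesis for $n - 1$, and Lemma 9.8 that
\begin{align*}
T_n(D, z) &= \sum_{\substack{p \in \mathcal{P} \\ p < z}} g(p)\, T_{n-1}\!\left( \frac{D}{p}, p \right) \\
&< \sum_{\substack{p \in \mathcal{P} \\ p < z}} g(p) V(p) \left( f_{n-1}\!\left( \frac{\log D}{\log p} - 1 \right) + h_{n-1}\!\left( \frac{\log D}{\log p} - 1 \right) \right) \\
&= \sum_{\substack{p \in \mathcal{P} \\ p < z}} g(p) V(p) \Phi(p) \\
&\le (K - 1) V(z) \Phi(z) - K V(z) \int_1^z \Phi(u)\, d\!\left( \frac{\log z}{\log u} \right) \\
&= (K - 1) V(z) \left( f_{n-1}(s - 1) + h_{n-1}(s - 1) \right) - K V(z) \int_1^z \Phi(u)\, d\!\left( \frac{\log z}{\log u} \right) \\
&= (K - 1) V(z) \left( f_{n-1}(s - 1) + h_{n-1}(s - 1) \right) + \frac{K V(z)}{s} \int_s^\infty \left( f_{n-1}(t - 1) + h_{n-1}(t - 1) \right) dt,
\end{align*}
where the last equation comes from substituting $t = \log D/\log u$ in the integral. By (9.21), we have
\[ \frac{K}{s} \int_s^\infty f_{n-1}(t - 1)\,dt = K f_n(s). \]
Similarly, from the definition of $H(s)$ and (9.25), we have
\[ \int_s^\infty h(t - 1)\,dt = H(s) \le \alpha s h(s) \]
and so
\[ \frac{K}{s} \int_s^\infty h_{n-1}(t - 1)\,dt \le \alpha K h_{n-1}(s). \]
Since $h(s - 1) \le 4h(s)$ for $s \ge 2$, we have
\[ (K - 1) h_{n-1}(s - 1) \le 4(K - 1) h_{n-1}(s) \]
and
\begin{align*}
(K - 1) f_{n-1}(s - 1) &\le (K - 1)\, 2e^2 \alpha^{n-2} h(s - 1) \\
&\le 8e^2 (K - 1) \alpha^{n-2} h(s) \\
&= 8e^{-8} \left( \frac{\alpha}{\tau} \right)^{n-1} \alpha^{-1} (K - 1) e^{10} \tau^{n-1} h(s) \\
&< 9e^{-8} h_{n-1}(s)
\end{align*}
since $0 < \alpha < \tau$ and $\alpha^{-1} < 9/8$. Therefore,
\[ \frac{T_n(D, z)}{V(z)} < K f_n(s) + \left( \alpha K + 4(K - 1) + 9e^{-8} \right) h_{n-1}(s). \]
By Lemma 9.7 and definition (9.31), we have
\[ (K - 1) f_n(s) \le (K - 1)\, 2e^2 \alpha^{n-1} h(s) < 2e^{-8} h_{n-1}(s), \]
and so
\[ K f_n(s) < f_n(s) + 2e^{-8} h_{n-1}(s). \]
Since
\[ \alpha K = K - (1 - \alpha) K < K - (1 - \alpha) = (K - 1) + \alpha, \]
we have
\begin{align*}
\frac{T_n(D, z)}{V(z)} &< f_n(s) + \left( \alpha + 5(K - 1) + 11e^{-8} \right) h_{n-1}(s) \\
&= f_n(s) + \tau h_{n-1}(s) \\
&= f_n(s) + h_n(s).
\end{align*}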
Let $n \ge 3$ be odd, and let $1 \le s \le 3$. If $z = D^{1/3}$, then $\log D/\log z = 3$. By the recursion formula (9.15) and the same argument used above, we obtain
\begin{align*}
T_n(D, z) &= \sum_{\substack{p \in \mathcal{P} \\ p < D^{1/3}}} g(p)\, T_{n-1}\!\left( \frac{D}{p}, p \right) \\
&< \sum_{\substack{p \in \mathcal{P} \\ p < D^{1/3}}} g(p) V(p) \Phi(p) \\
&\le (f_n(3) + h_n(3)) V(z) \\
&\le (f_n(s) + h_n(s)) V(z)
\end{align*}
since the functions $f_n(s)$ and $h(s)$ are decreasing. This completes the proof.


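Theorem 9.6 Let $z$, $D$, $s$, $\mathcal{P}$, $g(d)$, and $K = 1 + \varepsilon$ satisfy the hypotheses of Theorem 9.5. Let
\[ G(z, \lambda^{\pm}) = \sum_{d \mid P(z)} \lambda^{\pm}(d) g(d). \]
Then
\[ G(z, \lambda^{+}) < V(z) \left( F(s) + \varepsilon e^{14 - s} \right) \]
and
\[ G(z, \lambda^{-}) > V(z) \left( f(s) - \varepsilon e^{14 - s} \right), \]
where $F(s)$ and $f(s)$ are the continuous functions defined by (9.27) and (9.28).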


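Proof. We note that the sum of the following geometric series satisfies
\[ \sum_{\substack{n = 0 \\ n \equiv 0 \pmod 2}}^\infty \left( \frac{99}{100} \right)^n < 51 < e^4. \]
By (9.10) and Theorem 9.5,
\begin{align*}
G(z, \lambda^{+}) &= V(z) + \sum_{\substack{n = 1 \\ n \equiv 1 \pmod 2}}^\infty T_n(D, z) \\
&< V(z) \Biggl( 1 + \sum_{\substack{n = 1 \\ n \equiv 1 \pmod 2}}^\infty f_n(s) + \varepsilon e^{10 - s} \sum_{\substack{n = 1 \\ n \equiv 1 \pmod 2}}^\infty \left( \frac{99}{100} \right)^n \Biggr) \\
&< V(z) \left( F(s) + \varepsilon e^{14 - s} \right).
\end{align*}
Similarly, by (9.11) and Theorem 9.5,
\begin{align*}
G(z, \lambda^{-}) &= V(z) - \sum_{\substack{n = 2 \\ n \equiv 0 \pmod 2}}^\infty T_n(D, z) \\
&> V(z) \Biggl( 1 - \sum_{\substack{n = 2 \\ n \equiv 0 \pmod 2}}^\infty f_n(s) - \varepsilon e^{10 - s} \sum_{\substack{n = 2 \\ n \equiv 0 \pmod 2}}^\infty \left( \frac{99}{100} \right)^n \Biggr) \\
&> V(z) \left( f(s) - \varepsilon e^{14 - s} \right).
\end{align*}
This completes the proof.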
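
The numerical constants in the proofs of Theorems 9.5 and 9.6 leave little room, so it is reassuring to check them directly. The short sketch below does only that arithmetic; it uses the rounded value $0.9607$ for $\alpha$ and the extreme case $K = 1 + 1/200$, and the variable names are illustrative.

```python
import math

K_max = 1 + 1 / 200                      # largest K allowed in Theorem 9.5

# Upper bound for tau = alpha + 5(K - 1) + 11 e^{-8}, with alpha rounded up to 0.9607.
tau_upper = 0.9607 + 5 * (K_max - 1) + 11 * math.exp(-8)

# Geometric series in (99/100)^n over even n >= 0 and over odd n >= 1.
ratio = 0.99 ** 2
even_sum = 1 / (1 - ratio)
odd_sum = 0.99 / (1 - ratio)

print(tau_upper, even_sum, odd_sum, math.exp(4))
```

The printed values show $\tau < 99/100$ with a margin of under $10^{-3}$, and both geometric sums below $51 < e^4$.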
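Theorem 9.7 (Jurkat–Richert) Let $A = \{a(n)\}_{n=1}^\infty$ be an arithmetic function such that
\[ a(n) \ge 0 \quad \text{for all } n \]
and
\[ |A| = \sum_{n=1}^\infty a(n) < \infty. \]
Let $\mathcal{P}$ be a set of prime numbers and, for $z \ge 2$, let
\[ P(z) = \prod_{\substack{p \in \mathcal{P} \\ p < z}} p. \]
Let
\[ S(A, \mathcal{P}, z) = \sum_{\substack{n = 1 \\ (n, P(z)) = 1}}^\infty a(n). \]
For every $n \ge 1$, let $g_n(d)$ be a multiplicative function such that
\[ 0 \le g_n(p) < 1 \quad \text{for all } p \in \mathcal{P}. \tag{9.33} \]
Define $r(d)$ by
\[ |A_d| = \sum_{\substack{n = 1 \\ d \mid n}}^\infty a(n) = \sum_{n=1}^\infty a(n) g_n(d) + r(d). \]
Let $\mathcal{Q}$ be a finite subset of $\mathcal{P}$, and let $Q$ be the product of the primes in $\mathcal{Q}$. Suppose that, for some $\varepsilon$ satisfying $0 < \varepsilon < 1/200$, the inequality
\[ \prod_{\substack{p \in \mathcal{P} \setminus \mathcal{Q} \\ u \le p < z}} (1 - g_n(p))^{-1} < (1 + \varepsilon)\, \frac{\log z}{\log u} \tag{9.34} \]
holds for all $n$ and $1 < u < z$. Then for any $D \ge z$ there is the upper bound
\[ S(A, \mathcal{P}, z) < (F(s) + \varepsilon e^{14 - s}) X + R, \tag{9.35} \]
and for any $D \ge z^2$ there is the lower bound
\[ S(A, \mathcal{P}, z) > (f(s) - \varepsilon e^{14 - s}) X - R, \tag{9.36} \]
where
\[ s = \frac{\log D}{\log z}, \]
$F(s)$ and $f(s)$ are the continuous functions defined by (9.27) and (9.28),
\[ X = \sum_{n=1}^\infty a(n) \prod_{p \mid P(z)} (1 - g_n(p)), \tag{9.37} \]
and the remainder term is
\[ R = \sum_{\substack{d \mid P(z) \\ d < DQ}} |r(d)|. \]
If there is a multiplicative function $g(d)$ such that $g_n(d) = g(d)$ for all $n$, then
\[ X = V(z) |A|, \tag{9.38} \]
where
\[ V(z) = \prod_{p \mid P(z)} (1 - g(p)). \]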


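Proof. Let $\mathcal{P}_1 = \mathcal{P} \setminus \mathcal{Q}$. By Theorem 9.3, there exist upper and lower bound sieves $\lambda^{\pm}(d)$ with sieving range $\mathcal{P}_1$ and support level $D$, and with $|\lambda^{\pm}(d)| \le 1$ for all $d \ge 1$. We define
\[ G_n(z, \lambda^{\pm}) = \sum_{d \mid P_1(z)} \lambda^{\pm}(d) g_n(d) \]
and
\[ V_n(z) = \prod_{p \mid P_1(z)} (1 - g_n(p)). \]
Since $\mathcal{P}_1$ and $\mathcal{Q}$ are disjoint sets of primes, we have
\[ \prod_{p \mid P(z)} (1 - g_n(p)) = V_n(z) \prod_{q \mid Q(z)} (1 - g_n(q)). \]
By Theorem 9.6,
\[ G_n(z, \lambda^{+}) < V_n(z) \left( F(s) + \varepsilon e^{14 - s} \right) \]
and
\[ G_n(z, \lambda^{-}) > V_n(z) \left( f(s) - \varepsilon e^{14 - s} \right). \]
It follows from Theorem 9.2 that
\begin{align*}
S(A, \mathcal{P}, z) &\le \sum_{n=1}^\infty a(n) G_n(z, \lambda^{+}) \prod_{q \mid Q(z)} (1 - g_n(q)) + R \\
&< \left( F(s) + \varepsilon e^{14 - s} \right) \sum_{n=1}^\infty a(n) V_n(z) \prod_{q \mid Q(z)} (1 - g_n(q)) + R \\
&= \left( F(s) + \varepsilon e^{14 - s} \right) \sum_{n=1}^\infty a(n) \prod_{p \mid P(z)} (1 - g_n(p)) + R \\
&= \left( F(s) + \varepsilon e^{14 - s} \right) X + R.
\end{align*}
The lower bound is obtained similarly. This completes the proof.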


9.5  Differential-difference equations 


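In this section, we shall compute initial values for the functions
\[ F(s) = 1 + \sum_{\substack{n = 1 \\ n \equiv 1 \pmod 2}}^\infty f_n(s) \quad \text{for } s \ge 1 \]
and
\[ f(s) = 1 - \sum_{\substack{n = 2 \\ n \equiv 0 \pmod 2}}^\infty f_n(s) \quad \text{for } s \ge 2. \]
We shall prove that
\[ F(s) = \frac{2e^{\gamma}}{s} \quad \text{for } 1 \le s \le 3 \]
and
\[ f(s) = \frac{2e^{\gamma} \log(s - 1)}{s} \quad \text{for } 2 \le s \le 4, \]
where $\gamma$ is Euler's constant. We define $f(s) = 0$ for $1 \le s \le 2$.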


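Lemma 9.9
\[ s F(s) = 3 F(3) \quad \text{for } 1 \le s \le 3. \]

Proof. Let $1 \le s \le 3$. By Lemma 9.5,
\[ s f_n(s) = 3 f_n(3) \quad \text{for all odd } n \ge 3. \]
Since
\[ s + s f_1(s) = 3 \]
by (9.20), it follows that
\begin{align*}
s F(s) &= s + s f_1(s) + \sum_{\substack{n = 3 \\ n \equiv 1 \pmod 2}}^\infty s f_n(s) \\
&= 3 + \sum_{\substack{n = 3 \\ n \equiv 1 \pmod 2}}^\infty 3 f_n(3) \\
&= 3 F(3),
\end{align*}
which completes the proof.

Define the constants $A$ and $B$ by
\[ A = s F(s) \quad \text{for } 1 \le s \le 3 \]
and
\[ B = 2 f(2). \]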
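Lemma 9.10 The functions $F(s)$ and $f(s)$ are solutions of the system of differential-difference equations
\begin{align*}
(s F(s))' &= f(s - 1) \quad \text{for } s > 3 \\
(s f(s))' &= F(s - 1) \quad \text{for } s > 2.
\end{align*}

Proof. Let $n \ge 2$. By Lemma 9.5, for $n$ odd and $s \ge 3$, or for $n$ even and $s \ge 2$, we have
\[ s f_n(s) = \int_s^\infty f_{n-1}(t - 1)\,dt \]
and so
\[ (s f_n(s))' = -f_{n-1}(s - 1). \]
For $s > 3$, we have $s f_1(s) = 0$ and so
\begin{align*}
(s F(s))' &= \Biggl( s + \sum_{\substack{n = 1 \\ n \equiv 1 \pmod 2}}^\infty s f_n(s) \Biggr)' \\
&= \Biggl( s + \sum_{\substack{n = 3 \\ n \equiv 1 \pmod 2}}^\infty s f_n(s) \Biggr)' \\
&= 1 - \sum_{\substack{n = 3 \\ n \equiv 1 \pmod 2}}^\infty f_{n-1}(s - 1) \\
&= 1 - \sum_{\substack{n = 2 \\ n \equiv 0 \pmod 2}}^\infty f_n(s - 1) \\
&= f(s - 1).
\end{align*}
Similarly, for $s > 2$ we have
\begin{align*}
(s f(s))' &= \Biggl( s - \sum_{\substack{n = 2 \\ n \equiv 0 \pmod 2}}^\infty s f_n(s) \Biggr)' \\
&= 1 + \sum_{\substack{n = 2 \\ n \equiv 0 \pmod 2}}^\infty f_{n-1}(s - 1) \\
&= 1 + \sum_{\substack{n = 1 \\ n \equiv 1 \pmod 2}}^\infty f_n(s - 1) \\
&= F(s - 1).
\end{align*}
This completes the proof.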
Lemma 9.11 For $s \ge 2$, let
\[ P(s) = F(s) + f(s) \]
and
\[ Q(s) = F(s) - f(s). \]
For $s \ge 3$, the functions $P(s)$ and $Q(s)$ are the unique solutions of the differential-difference equations
\[ s P'(s) = -P(s) + P(s - 1) \tag{9.39} \]
and
\[ s Q'(s) = -Q(s) - Q(s - 1) \tag{9.40} \]
that satisfy the initial conditions
\[ s P(s) = A + B + A \log(s - 1) \]
and
\[ s Q(s) = A - B - A \log(s - 1) \]
for $2 \le s \le 3$. Moreover,
\[ P(s) = 2 + O(e^{-s}) \]
and
\[ Q(s) = O(e^{-s}). \]


Proof. Since
\[ s F(s) = A \quad \text{for } 1 \le s \le 3, \]
it follows that
\[ F(s) = \frac{A}{s} \quad \text{for } 1 \le s \le 3 \]
or, equivalently, that
\[ F(s - 1) = \frac{A}{s - 1} \quad \text{for } 2 \le s \le 4. \]
Since $(s f(s))' = F(s - 1)$ for $s > 2$, it follows that
\[ s f(s) = 2 f(2) + \int_2^s \frac{A}{t - 1}\,dt = B + A \log(s - 1) \]
for $2 \le s \le 4$. Since
\[ s F(s) = A \quad \text{for } 1 \le s \le 3, \]
it follows that
\[ s P(s) = A + B + A \log(s - 1) \tag{9.41} \]
and
\[ s Q(s) = A - B - A \log(s - 1) \tag{9.42} \]
for $2 \le s \le 3$. For $s > 3$, we have
\[ (s P(s))' = (s F(s))' + (s f(s))' = f(s - 1) + F(s - 1) = P(s - 1), \]
and so
\[ s P'(s) = -P(s) + P(s - 1). \]
Similarly,
\[ (s Q(s))' = (s F(s))' - (s f(s))' = f(s - 1) - F(s - 1) = -Q(s - 1) \]
and so
\[ s Q'(s) = -Q(s) - Q(s - 1). \]
By Theorem 9.4, we have $F(s) = 1 + O(e^{-s})$ and $f(s) = 1 + O(e^{-s})$, and so $P(s) = 2 + O(e^{-s})$ and $Q(s) = O(e^{-s})$. This completes the proof.
The differential-difference equations (9.39) and (9.40) are of the form
\[ s R'(s) = -a R(s) - b R(s - 1). \tag{9.43} \]
Associated with this equation is the adjoint equation
\[ (s r(s))' = a r(s) + b r(s + 1). \tag{9.44} \]
To every solution $R(s)$ of equation (9.43) and every solution $r(s)$ of equation (9.44), we associate the function
\[ \langle R(s), r(s) \rangle = s R(s) r(s) - b \int_{s-1}^{s} R(x) r(x + 1)\,dx \]
for $s \ge 3$. Differentiating with respect to $s$, we obtain
\begin{align*}
\frac{d}{ds} \langle R(s), r(s) \rangle &= R(s) r(s) + s R'(s) r(s) + s R(s) r'(s) - b R(s) r(s + 1) + b R(s - 1) r(s) \\
&= (s R'(s) + b R(s - 1)) r(s) + (r(s) + s r'(s) - b r(s + 1)) R(s) \\
&= -a R(s) r(s) + a R(s) r(s) \\
&= 0.
\end{align*}
Therefore, $\langle R(s), r(s) \rangle$ is constant for $s \ge 3$.
The equation adjoint to (9.40) is
\[ (s q(s))' = q(s) + q(s + 1) \]
or, equivalently,
\[ s q'(s) = q(s + 1). \]
This has the solution
\[ q(s) = s - 1. \]
Clearly,
\[ q(s) \sim s \]
as $s$ tends to infinity, and
\[ q(1) = 0. \]
Since $Q(s) = O(e^{-s})$, it follows that
\[ s Q(s) q(s) = O(s^2 e^{-s}) = o(1) \]
and
\[ \int_{s-1}^{s} Q(x) q(x + 1)\,dx = o(1). \]
Therefore,
\[ \lim_{s \to \infty} \langle Q(s), q(s) \rangle = 0. \]
Since $\langle Q(s), q(s) \rangle$ is constant for $s \ge 3$, it follows that
\[ \langle Q(s), q(s) \rangle = 0 \]
for $s \ge 3$. This implies that $B = 0$, since $(x Q(x))' = -A (x - 1)^{-1}$ for $2 \le x \le 3$ by (9.42), and
\begin{align*}
0 &= \langle Q(3), q(3) \rangle \\
&= 3 Q(3) q(3) - \int_2^3 Q(x) q(x + 1)\,dx \\
&= 3 Q(3) q(3) - \int_2^3 x Q(x) q'(x)\,dx \\
&= 3 Q(3) q(3) - \Bigl[ x Q(x) q(x) \Bigr]_2^3 + \int_2^3 (x Q(x))' q(x)\,dx \\
&= 2 Q(2) q(2) - A \int_2^3 \frac{q(x)}{x - 1}\,dx \\
&= (A - B) - A \\
&= -B.
\end{align*}


Similarly, the equation adjoint to (9.39) is
\[ (s p(s))' = p(s) - p(s + 1) \tag{9.45} \]
or, equivalently,
\[ s p'(s) = -p(s + 1). \]
For $s > 0$, we introduce the function
\[ p(s) = \int_0^\infty \exp(-sx - I(x))\,dx, \tag{9.46} \]
where
\[ I(x) = \int_0^x (1 - e^{-t}) t^{-1}\,dt. \]
Since
\[ 0 < \frac{1 - e^{-t}}{t} < 1 \quad \text{for } t > 0, \]
we have
\[ 0 < I(x) < x \quad \text{for } x > 0, \]
and so
\[ \exp(-(s + 1)x) < \exp(-sx - I(x)) < \exp(-sx). \]
Therefore, the integral converges for all $s > 0$, and
\[ \frac{1}{s + 1} = \int_0^\infty \exp(-(s + 1)x)\,dx < p(s) < \int_0^\infty \exp(-sx)\,dx = \frac{1}{s}. \]
It follows that
\[ s p(s) \sim 1 \]
as $s$ tends to infinity. Using integration by parts and the observation that
\[ x I'(x) = 1 - e^{-x}, \]
we obtain
\begin{align*}
s p'(s) &= -\int_0^\infty s x \exp(-sx - I(x))\,dx \\
&= \int_0^\infty \left( \frac{d}{dx} \exp(-sx) \right) x \exp(-I(x))\,dx \\
&= \Bigl[ x \exp(-sx - I(x)) \Bigr]_0^\infty - \int_0^\infty \exp(-sx) \left( \frac{d}{dx}\, x \exp(-I(x)) \right) dx \\
&= -\int_0^\infty \exp(-sx) (1 - x I'(x)) \exp(-I(x))\,dx \\
&= -\int_0^\infty \exp(-sx) \exp(-x) \exp(-I(x))\,dx \\
&= -\int_0^\infty \exp(-(s + 1)x - I(x))\,dx \\
&= -p(s + 1).
\end{align*}
This proves that $p(s)$ is a solution to the adjoint equation (9.45) for all $s > 0$.
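We shall prove that
\[ p(1) = e^{-\gamma}. \]
We need the following integral representation for Euler's constant:
\[ \gamma = \int_0^1 (1 - e^{-t}) t^{-1}\,dt - \int_1^\infty e^{-t} t^{-1}\,dt \tag{9.47} \]
(see Exercise 16 and Gradshteyn and Ryzhik [42, page 956]). Then
\begin{align*}
I(x) &= \int_0^x (1 - e^{-t}) t^{-1}\,dt \\
&= \int_0^1 (1 - e^{-t}) t^{-1}\,dt + \int_1^x (1 - e^{-t}) t^{-1}\,dt \\
&= \int_0^1 (1 - e^{-t}) t^{-1}\,dt - \int_1^x e^{-t} t^{-1}\,dt + \log x \\
&= \int_0^1 (1 - e^{-t}) t^{-1}\,dt - \int_1^\infty e^{-t} t^{-1}\,dt + \int_x^\infty e^{-t} t^{-1}\,dt + \log x \\
&= \gamma + \int_x^\infty e^{-t} t^{-1}\,dt + \log x.
\end{align*}
It follows that
\begin{align*}
-s p'(s) &= \int_0^\infty s x \exp(-sx - I(x))\,dx \\
&= e^{-\gamma} \int_0^\infty s \exp\left( -sx - \int_x^\infty e^{-t} t^{-1}\,dt \right) dx \\
&= e^{-\gamma} \int_0^\infty \exp\left( -u - \int_{u/s}^\infty e^{-t} t^{-1}\,dt \right) du.
\end{align*}
For $u > 0$, we have
\[ \lim_{s \to 0^+} \int_{u/s}^\infty e^{-t} t^{-1}\,dt = 0, \]
and so
\begin{align*}
p(1) &= \lim_{s \to 0^+} p(s + 1) \\
&= -\lim_{s \to 0^+} s p'(s) \\
&= e^{-\gamma} \lim_{s \to 0^+} \int_0^\infty \exp\left( -u - \int_{u/s}^\infty e^{-t} t^{-1}\,dt \right) du \\
&= e^{-\gamma} \int_0^\infty \lim_{s \to 0^+} \exp\left( -u - \int_{u/s}^\infty e^{-t} t^{-1}\,dt \right) du \\
&= e^{-\gamma} \int_0^\infty \exp(-u)\,du \\
&= e^{-\gamma}.
\end{align*}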
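
The value $p(1) = e^{-\gamma}$ and the adjoint relation $s p'(s) = -p(s+1)$ can both be checked by direct quadrature. The following is a rough numerical sketch, not a proof: it truncates the integral defining $p(s)$ at $x = 40$ and applies the trapezoid rule, accumulating $I(x)$ along the same grid. The function name, cutoff, and step count are assumptions of this example.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded

def p_adjoint(s, x_max=40.0, steps=40000):
    """Trapezoid approximation of p(s) = int_0^inf exp(-s x - I(x)) dx,
    where I(x) = int_0^x (1 - e^{-t}) / t dt is accumulated on the same grid."""
    h = x_max / steps

    def dI(t):
        # Integrand of I; its limit as t -> 0 is 1.
        return 1.0 if t == 0.0 else -math.expm1(-t) / t

    I = 0.0
    prev_dI = dI(0.0)
    prev_val = 1.0            # exp(-s*0 - I(0)) = 1
    total = 0.0
    for i in range(1, steps + 1):
        x = i * h
        cur_dI = dI(x)
        I += 0.5 * h * (prev_dI + cur_dI)
        val = math.exp(-s * x - I)
        total += 0.5 * h * (prev_val + val)
        prev_dI, prev_val = cur_dI, val
    return total

p1 = p_adjoint(1.0)
p2 = p_adjoint(2.0)
# Central difference for p'(1); the adjoint equation predicts 1 * p'(1) = -p(2).
dp1 = (p_adjoint(1.001) - p_adjoint(0.999)) / 0.002

print(p1, math.exp(-EULER_GAMMA), dp1, -p2)
```

The first two printed numbers should agree closely, as should the last two.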


Since $P(s) = 2 + O(e^{-s})$ and $s p(s) \sim 1$, it follows that
\[ \lim_{s \to \infty} \langle P(s), p(s) \rangle = \lim_{s \to \infty} \left( s P(s) p(s) + \int_{s-1}^{s} P(x) p(x + 1)\,dx \right) = 2. \]
Since $\langle P(s), p(s) \rangle$ is constant for $s \ge 3$, it follows that
\[ \langle P(s), p(s) \rangle = 2 \]
for all $s \ge 3$. Letting $B = 0$ in (9.41), we have
\[ s P(s) = A + A \log(s - 1) \]
and
\[ (s P(s))' = \frac{A}{s - 1} \]
for $2 \le s \le 3$. Therefore, $2P(2) = A$ and
\begin{align*}
2 &= \langle P(3), p(3) \rangle \\
&= 3 P(3) p(3) + \int_2^3 P(x) p(x + 1)\,dx \\
&= 3 P(3) p(3) - \int_2^3 x P(x) p'(x)\,dx \\
&= 3 P(3) p(3) - \Bigl[ x P(x) p(x) \Bigr]_2^3 + \int_2^3 (x P(x))' p(x)\,dx \\
&= 2 P(2) p(2) + A \int_2^3 \frac{p(x)}{x - 1}\,dx \\
&= A p(2) + A \int_2^3 \frac{p(x)}{x - 1}\,dx \\
&= A p(2) - A \int_2^3 p'(x - 1)\,dx \\
&= A p(2) - A p(2) + A p(1) \\
&= A e^{-\gamma}.
\end{align*}
This proves that
\[ A = 2e^{\gamma}. \]


We can now determine the initial values of $F(s)$ and $f(s)$.

Theorem 9.8
\[ F(s) = \frac{2e^{\gamma}}{s} \quad \text{for } 1 \le s \le 3 \]
and
\[ f(s) = \frac{2e^{\gamma} \log(s - 1)}{s} \quad \text{for } 2 \le s \le 4, \]
where $\gamma$ is Euler's constant.

Proof. Let $2 \le s \le 3$, and let $A = 2e^{\gamma}$ and $B = 0$ in (9.41) and (9.42). Then
\[ s P(s) = 2e^{\gamma} + 2e^{\gamma} \log(s - 1) \]
and
\[ s Q(s) = 2e^{\gamma} - 2e^{\gamma} \log(s - 1). \]
Therefore,
\[ s F(s) = \frac{s P(s) + s Q(s)}{2} = 2e^{\gamma}. \]
By Lemma 9.9, $s F(s)$ is constant for $1 \le s \le 3$ and so
\[ s F(s) = 2e^{\gamma} \quad \text{for } 1 \le s \le 3. \]
By Lemma 9.10, we have $(s f(s))' = F(s - 1)$ for $s > 2$ and so
\[ s f(s) = 2 f(2) + \int_2^s F(t - 1)\,dt = \int_2^s \frac{2e^{\gamma}}{t - 1}\,dt = 2e^{\gamma} \log(s - 1) \]
for $2 \le s \le 4$. This completes the proof.


9.6 Notes 


The material in this chapter is based on unpublished lecture notes of Henryk Iwaniec [68]. See Jurkat and Richert [69] for the original proof of Theorem 9.7. Standard references on sieve methods are the monographs of Halberstam and Richert [44] and Motohashi [87].


9.7 Exercises 


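1. Let $P$ be the product of the primes up to $\sqrt{x}$. Prove Legendre's formula
\[ \pi(x) - \pi(\sqrt{x}) + 1 = \sum_{d \mid P} \mu(d) \left[ \frac{x}{d} \right] = [x] - \sum_{p_1 \le \sqrt{x}} \left[ \frac{x}{p_1} \right] + \sum_{p_2 < p_1 \le \sqrt{x}} \left[ \frac{x}{p_1 p_2} \right] - \sum_{p_3 < p_2 < p_1 \le \sqrt{x}} \left[ \frac{x}{p_1 p_2 p_3} \right] + \cdots. \]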


2. Let P be the product of the primes up to ,/x. Prove Sylvester’s formula 


& oth Dao [5] (S]) 


VE<PSX 


3. Let $A_1 = \{a_1(n)\}$ and $A_2 = \{a_2(n)\}$ be arithmetic functions such that $a_1(n) \le a_2(n)$ for all $n \ge 1$. Prove that
\[ S(A_1, \mathcal{P}, z) \le S(A_2, \mathcal{P}, z). \]

4. Let $A_\ell = \{a_\ell(n)\}$ be a nonnegative arithmetic function for $\ell = 1, \ldots, k$, and let $A = \{a(n)\}$ be the arithmetic function defined by $a(n) = a_1(n) + \cdots + a_k(n)$ for all $n$. Prove that
\[ S(A, \mathcal{P}, z) = \sum_{\ell = 1}^{k} S(A_\ell, \mathcal{P}, z). \]

5. Let $2 \le w < z$. Prove Buchstab's identity:
\[ S(A, \mathcal{P}, z) = S(A, \mathcal{P}, w) - \sum_{\substack{p \in \mathcal{P} \\ w \le p < z}} S(A_p, \mathcal{P}, p). \]
In particular,
\[ S(A, \mathcal{P}, z) = |A| - \sum_{\substack{p \in \mathcal{P} \\ p < z}} S(A_p, \mathcal{P}, p). \]

6. By iterating the Buchstab identity, prove that, for $z_1 < z$,
\[ S(A, \mathcal{P}, z) \le |A| - \sum_{p_1 < z_1} |A_{p_1}| + \sum_{p_2 < p_1 < z_1} |A_{p_1 p_2}| - \sum_{p_3 < p_2 < p_1 < z_1} S(A_{p_1 p_2 p_3}, \mathcal{P}, p_3). \]

7. Let $\mathcal{P}$ be a set of primes, and let $\lambda^{\pm}(d)$ be upper and lower bound sieves with sieving range $\mathcal{P}$ and support level $D$. Let $\mathcal{P}_1$ be a subset of $\mathcal{P}$. We define functions $\lambda_1^{\pm}(d)$ by $\lambda_1^{\pm}(d) = \lambda^{\pm}(d)$ if $d$ is divisible only by primes in $\mathcal{P}_1$, and $\lambda_1^{\pm}(d) = 0$ otherwise. Prove that $\lambda_1^{\pm}(d)$ are upper and lower bound sieves with sieving range $\mathcal{P}_1$ and support level $D$.

8. Let $h(s)$ be the function defined by (9.23). Prove that
\[ h(s - 1) \le 4 h(s) \quad \text{for } s \ge 2. \]

9. Use the recurrence relation
\[ s f_2(s) = \int_s^\infty f_1(t - 1)\,dt \]
to prove that
\[ s f_2(s) = s - 3 \log(s - 1) + 3 \log 3 - 4 \]
for $2 \le s \le 4$.

10. Prove that
\[ f(x) = x \log \frac{9x}{9x - 1} \le \log \frac{9}{8} \]
for $x \ge 1$. Hint: Show that the function $f(x)$ is decreasing for $x \ge 1$.


11. Let $Q(s)$ be a continuous function on the interval $[1, 2]$. Prove that there exists a unique continuous function $Q(s)$ defined for all $s \ge 1$ that satisfies this initial condition and that is a solution of the differential-difference equation
\[ s Q'(s) = -Q(s) - Q(s - 1) \]
for all $s > 2$. Hint: For $2 \le s \le 3$, we must have
\[ s Q(s) = -\int_2^s Q(x - 1)\,dx + 2Q(2). \]
Similarly, for $3 \le s \le 4$, we must have
\[ s Q(s) = -\int_3^s Q(x - 1)\,dx + 3Q(3). \]
The proof proceeds by induction.

12. Let $Q(s)$ be the function defined in Lemma 9.11. Prove that
\[ s(s - 1) Q(s) = \int_{s-1}^{s} x Q(x)\,dx \]
for all $s \ge 3$. Prove that


0<sO(s)<s”. 


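13. Let $\mathcal{P}_1$ and $\mathcal{P}_2$ be disjoint sets of prime numbers, and let $f_1$ and $f_2$ be arithmetic functions such that $f_1(d) \ne 0$ only if $d$ is a product of primes belonging to $\mathcal{P}_1$ and $f_2(d) \ne 0$ only if $d$ is a product of primes belonging to $\mathcal{P}_2$. Let $f = f_1 * f_2$. Prove that
\[ 1 * f = (1 * f_1)(1 * f_2). \]

14. Let $\lambda_1^{+}(d)$ and $\lambda_2^{+}(d)$ be upper bound sieves with support levels $D_1$ and $D_2$, respectively, and with disjoint sieving ranges $\mathcal{P}_1$ and $\mathcal{P}_2$. Let $\lambda^{+}(d)$ be the convolution of $\lambda_1^{+}(d)$ and $\lambda_2^{+}(d)$, that is,
\[ \lambda^{+}(d) = \lambda_1^{+} * \lambda_2^{+}(d) = \sum_{d = d_1 d_2} \lambda_1^{+}(d_1) \lambda_2^{+}(d_2). \]
Prove that $\lambda^{+}$ is an upper bound sieve with support level $D = D_1 D_2$ and sieving range $\mathcal{P}_1 \cup \mathcal{P}_2$.

15. Let $\lambda_1^{+}(d)$ and $\lambda_2^{+}(d)$ be upper bound sieves with support levels $D_1$ and $D_2$, respectively, and with disjoint sieving ranges $\mathcal{P}_1$ and $\mathcal{P}_2$, and let $\lambda_1^{-}(d)$ and $\lambda_2^{-}(d)$ be lower bound sieves with support levels $D_1$ and $D_2$, respectively, and with disjoint sieving ranges $\mathcal{P}_1$ and $\mathcal{P}_2$. Let
\[ \lambda^{-}(d) = \lambda_1^{+} * \lambda_2^{-}(d) - \lambda_1^{+} * \lambda_2^{+}(d) + \lambda_1^{-} * \lambda_2^{+}(d). \]
Prove that $\lambda^{-}$ is a lower bound sieve with support level $D = D_1 D_2$ and sieving range $\mathcal{P}_1 \cup \mathcal{P}_2$.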
16. In the theory of the Gamma function, it is proved that
\[ -\gamma = \Gamma'(1) = \int_0^\infty e^{-x} \log x\,dx. \]
From this formula, use integration by parts to obtain (9.47):
\[ \gamma = \int_0^1 (1 - e^{-t}) t^{-1}\,dt - \int_1^\infty e^{-t} t^{-1}\,dt. \]


10


Chen's theorem


Is it even true that every even n is the sum of 2 primes? To show this 
seems to transcend our present mathematical powers. ... The prime 
numbers remain very elusive fellows. 


H. Weyl [142] 


10.1. Primes and almost primes 


In this chapter, we shall prove one of the most famous results in additive prime number theory: Chen's theorem that every sufficiently large even integer can be written as the sum of an odd prime and a number that is either prime or the product of two primes. An integer that is the product of at most $r$ not necessarily distinct prime numbers is called an almost prime of order $r$, denoted $P_r$, and so Chen's theorem can be written in the form
\[ N = p + P_2 \]
for every sufficiently large even integer $N$. We shall prove not only that every large even integer $N$ has at least one representation as the sum of a prime and an almost prime of order two but that there are, in fact, many such representations.


Theorem 10.1 (Chen) Let $r(N)$ denote the number of representations of $N$ in the form
\[ N = p + n, \]
where $p$ is an odd prime and $n$ is the product of at most two primes. Then
\[ r(N) \gg \mathfrak{S}(N)\, \frac{2N}{(\log N)^2}, \tag{10.1} \]
where
\[ \mathfrak{S}(N) = \prod_{p > 2} \left( 1 - \frac{1}{(p - 1)^2} \right) \prod_{\substack{p \mid N \\ p > 2}} \frac{p - 1}{p - 2}. \tag{10.2} \]
The number $\mathfrak{S}(N)$ is called the singular series for the Goldbach conjecture.

The proof has two ingredients. The first is the Jurkat–Richert theorem (Theorem 9.7), which gives upper and lower bounds for the linear sieve. The second is the Bombieri–Vinogradov theorem, which describes the average distribution of prime numbers in arithmetic progressions. Throughout this chapter, $p$ and $q$ denote prime numbers.
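
To get a feeling for the size of the singular series, one can truncate the infinite product in (10.2). The sketch below is an illustration only: it multiplies over odd primes below a cutoff (a hypothetical parameter `limit`), so it assumes every odd prime divisor of $N$ lies below that cutoff, and the neglected tail of the first product differs from $1$ by roughly $1/(\texttt{limit} \cdot \log \texttt{limit})$.

```python
def odd_primes_below(limit):
    """Odd primes p < limit, by a sieve of Eratosthenes."""
    sieve = bytearray([1]) * limit
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, limit, i)))
    return [p for p in range(3, limit) if sieve[p]]

def singular_series(N, limit=100000):
    """Truncation of the product (10.2) to odd primes p < limit.

    Assumes every odd prime factor of N is smaller than `limit`.
    """
    value = 1.0
    for p in odd_primes_below(limit):
        value *= 1.0 - 1.0 / (p - 1) ** 2
        if N % p == 0:
            value *= (p - 1) / (p - 2)
    return value

for N in (2 ** 20, 6, 30, 210):
    print(N, singular_series(N))
```

For a power of $2$ the second product is empty, so the value is the first product alone (about $0.66$); each odd prime $p$ dividing $N$ multiplies this by $(p-1)/(p-2)$, so integers with many small odd prime factors have a larger singular series.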


10.2. Weights 


Let $N$ be an even integer, $N \ge 4^8$. We begin by assigning a weight $w(n)$ to every positive integer $n$. Let
\[ z = N^{1/8} \tag{10.3} \]
and
\[ y = N^{1/3}. \tag{10.4} \]
Then $z \ge 4$. We define
\[ w(n) = 1 - \frac{1}{2} \sum_{\substack{z \le q < y \\ q^k \| n}} k - \frac{1}{2} \sum_{\substack{p_1 p_2 p_3 = n \\ z \le p_1 < y \le p_2 \le p_3}} 1. \tag{10.5} \]
Clearly,
\[ w(n) \le 1 \]
for all $n$, and $w(n) = 1$ if and only if $n$ is divisible by no prime in the interval $[z, y)$.

Let $\mathcal{P}$ be the set of prime numbers that do not divide $N$. Then $2 \notin \mathcal{P}$ since $N$ is even. Let
\[ P(z) = \prod_{\substack{p \in \mathcal{P} \\ p < z}} p. \]
Let $n$ be a positive integer such that
\[ n < N \quad \text{and} \quad (n, N) = (n, P(z)) = 1. \]
Then $n$ is divisible only by primes $p \ge z$ that do not divide $N$. If $n = p_1 p_2 \cdots p_r p_{r+1} \cdots p_{r+s}$, where
\[ z \le p_1 \le \cdots \le p_r < y \le p_{r+1} \le \cdots \le p_{r+s}, \]
then
\[ N^{s/3} = y^s \le p_{r+1} \cdots p_{r+s} \le n < N \]
and so $s = 0$, $1$, or $2$. Suppose that $w(n) > 0$. Since
\[ 0 < w(n) \le 1 - \frac{1}{2} \sum_{\substack{z \le q < y \\ q^k \| n}} k = 1 - \frac{r}{2}, \]
it follows that $r = 0$ or $1$. If $r = 1$ and $s = 2$, then $n = p_1 p_2 p_3$, where $z \le p_1 < y \le p_2 \le p_3$, and so $w(n) = 0$. Therefore, if $w(n) > 0$, then either $r = 0$ and $s = 0$, $1$, or $2$, or $r = 1$ and $s = 0$ or $1$. In all of these cases, $r + s \le 2$. Therefore, if $(n, N) = (n, P(z)) = 1$ and $w(n) > 0$, then either $n = 1$ or $n$ is an integer of the form $p_1$ or $p_1 p_2$, where $p_1$ and $p_2$ are primes $\ge z$ that do not divide $N$.
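
The weight (10.5) is easy to evaluate for a single integer, and a few hand-checkable cases make the case analysis above concrete. The following sketch is an illustration with hypothetical helper names; it factors $n$ by trial division and computes $z$ and $y$ in floating point, which is adequate here because no prime in the examples equals $N^{1/8}$ or $N^{1/3}$ exactly.

```python
def factorize(n):
    """Prime factorization of n >= 1 as a list of (prime, exponent) pairs."""
    factors, d = [], 2
    while d * d <= n:
        if n % d == 0:
            k = 0
            while n % d == 0:
                n //= d
                k += 1
            factors.append((d, k))
        d += 1
    if n > 1:
        factors.append((n, 1))
    return factors

def weight(n, N):
    """The weight w(n) of (10.5) with z = N^(1/8) and y = N^(1/3)."""
    z, y = N ** (1 / 8), N ** (1 / 3)
    factors = factorize(n)
    w = 1.0
    # First sum: exponents k of the primes q in [z, y) with q^k exactly dividing n.
    w -= 0.5 * sum(k for q, k in factors if z <= q < y)
    # Second sum: n = p1 p2 p3 with z <= p1 < y <= p2 <= p3 (at most one way).
    primes = sorted(q for q, k in factors for _ in range(k))
    if len(primes) == 3 and z <= primes[0] < y <= primes[1]:
        w -= 0.5
    return w

N = 10 ** 6          # z is about 5.62 and y is about 100
for n in (1, 101, 101 * 103, 7, 7 * 101, 49, 7 * 11, 7 * 101 * 103):
    print(n, weight(n, N))
```

With $N = 10^6$, the integers $1$, $101$, and $101 \cdot 103$ have weight $1$; $7$ and $7 \cdot 101$ have weight $1/2$; and $49$, $7 \cdot 11$, and $7 \cdot 101 \cdot 103$ have weight $0$, matching the conclusion that $w(n) > 0$ forces $r + s \le 2$.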

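Consider the set
\[ \mathcal{A} = \{ N - p : p \le N,\ p \in \mathcal{P} \}. \tag{10.6} \]
Then $\mathcal{A}$ is a finite set of positive integers, and $|\mathcal{A}| = \pi(N) - \omega(N)$, where $\omega(N)$ denotes the number of distinct prime divisors of $N$. If $n = N - p \in \mathcal{A}$ and if $(n, N) > 1$, then $p$ divides $N$ and so $p \notin \mathcal{P}$, which is absurd. Therefore, $(n, N) = 1$ for all $n \in \mathcal{A}$. We obtain a lower bound for $r(N)$ as follows.
\begin{align*}
r(N) &\ge \sum_{\substack{N = p + n \\ n \in \{1,\, p_1,\, p_1 p_2 \,:\, p_1, p_2 \ge z\}}} 1 \\
&\ge \sum_{\substack{n \in \mathcal{A} \\ n \in \{1,\, p_1,\, p_1 p_2 \,:\, p_1, p_2 \ge z\}}} 1 \\
&= \sum_{\substack{n \in \mathcal{A} \\ (n, P(z)) = 1 \\ n \in \{1,\, p_1,\, p_1 p_2 \,:\, p_1, p_2 \ge z\}}} 1 \\
&\ge \sum_{\substack{n \in \mathcal{A} \\ (n, P(z)) = 1 \\ n \in \{1,\, p_1,\, p_1 p_2 \,:\, p_1, p_2 \ge z\}}} w(n) \\
&\ge \sum_{\substack{n \in \mathcal{A} \\ (n, P(z)) = 1}} w(n) \\
&= \sum_{\substack{n \in \mathcal{A} \\ (n, P(z)) = 1}} \Biggl( 1 - \frac{1}{2} \sum_{\substack{z \le q < y \\ q^k \| n}} k - \frac{1}{2} \sum_{\substack{p_1 p_2 p_3 = n \\ z \le p_1 < y \le p_2 \le p_3}} 1 \Biggr) \\
&= \sum_{\substack{n \in \mathcal{A} \\ (n, P(z)) = 1}} 1 - \frac{1}{2} \sum_{\substack{n \in \mathcal{A} \\ (n, P(z)) = 1}} \sum_{\substack{z \le q < y \\ q^k \| n}} k - \frac{1}{2} \sum_{\substack{n \in \mathcal{A} \\ (n, P(z)) = 1}} \sum_{\substack{p_1 p_2 p_3 = n \\ z \le p_1 < y \le p_2 \le p_3}} 1.
\end{align*}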
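We shall express these three sums as sieving functions. If we let $A = \{a(n)\}_{n=1}^\infty$ be the characteristic function of the finite set $\mathcal{A}$, then the first sum becomes simply
\[ \sum_{\substack{n \in \mathcal{A} \\ (n, P(z)) = 1}} 1 = \sum_{(n, P(z)) = 1} a(n) = S(A, \mathcal{P}, z). \]
We divide the second sum into two pieces:
\[ \sum_{\substack{n \in \mathcal{A} \\ (n, P(z)) = 1}} \sum_{\substack{z \le q < y \\ q^k \| n}} k = \sum_{\substack{n \in \mathcal{A} \\ (n, P(z)) = 1}} \sum_{\substack{z \le q < y \\ q \mid n}} 1 + \sum_{\substack{n \in \mathcal{A} \\ (n, P(z)) = 1}} \sum_{\substack{z \le q < y \\ q^k \| n \\ k \ge 2}} (k - 1). \]
The first piece can be expressed as a sieving function as follows: For every prime $q$, let $A_q = \{a_q(n)\}_{n=1}^\infty$ be the arithmetic function defined by
\[ a_q(n) = \begin{cases} 1 & \text{if } n \in \mathcal{A} \text{ and } q \mid n \\ 0 & \text{otherwise.} \end{cases} \]
Since $(n, N) = 1$ for all $n \in \mathcal{A}$, we have $q \in \mathcal{P}$ if $a_q(n) = 1$, and
\begin{align*}
\sum_{\substack{n \in \mathcal{A} \\ (n, P(z)) = 1}} \sum_{\substack{z \le q < y \\ q \mid n}} 1 &= \sum_{z \le q < y} \sum_{(n, P(z)) = 1} a_q(n) \\
&= \sum_{z \le q < y} S(A_q, \mathcal{P}, z).
\end{align*}
It is easy to estimate the second piece. Since $z = N^{1/8} \ge 4$ and
\[ \sum_{k=2}^\infty \frac{k - 1}{q^k} = \frac{1}{(q - 1)^2}, \]
we have
\begin{align*}
\sum_{\substack{n \in \mathcal{A} \\ (n, P(z)) = 1}} \sum_{\substack{z \le q < y \\ q^k \| n \\ k \ge 2}} (k - 1) &= \sum_{z \le q < y} \sum_{k=2}^\infty \sum_{\substack{n \in \mathcal{A} \\ (n, P(z)) = 1 \\ q^k \| n}} (k - 1) \\
&\le \sum_{z \le q < y} \sum_{k=2}^\infty \sum_{\substack{n < N \\ q^k \mid n}} (k - 1) \\
&\le N \sum_{z \le q < y} \sum_{k=2}^\infty \frac{k - 1}{q^k} \\
&= N \sum_{z \le q < y} \frac{1}{(q - 1)^2}.
\end{align*}
Since $z \ge 4$, the last sum is at most $\sum_{m \ge z} 1/((m - 1)(m - 2)) \le 1/(z - 2) \le 2/z$, and so the second piece is at most $2N/z = 2N^{7/8}$.
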
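For the third sum, we let $\mathcal{B}$ be the set of all positive integers of the form
\[ N - p_1 p_2 p_3, \]
where the primes $p_1, p_2, p_3$ satisfy the conditions
\begin{gather*}
z \le p_1 < y \le p_2 \le p_3 \\
p_1 p_2 p_3 < N \\
(p_1 p_2 p_3, N) = 1.
\end{gather*}
Let $B = \{b(n)\}_{n=1}^\infty$ be the characteristic function of the finite set $\mathcal{B}$. An element of $\mathcal{B}$ is a prime $p$ if and only if $p < N$ and $N - p = p_1 p_2 p_3 \in \mathcal{A}$, where $z \le p_1 < y \le p_2 \le p_3$. Therefore,
\begin{align*}
\sum_{\substack{n \in \mathcal{A} \\ (n, P(z)) = 1}} \sum_{\substack{p_1 p_2 p_3 = n \\ z \le p_1 < y \le p_2 \le p_3}} 1 &= \sum_{\substack{p_1 p_2 p_3 \in \mathcal{A} \\ z \le p_1 < y \le p_2 \le p_3}} 1 \\
&= \sum_{p \in \mathcal{B}} 1 = \sum_{\substack{p \in \mathcal{B} \\ p < y}} 1 + \sum_{\substack{p \in \mathcal{B} \\ p \ge y}} 1 \\
&\le y + \sum_{\substack{p \in \mathcal{B} \\ p \ge y}} 1 \\
&\le y + \sum_{\substack{n \in \mathcal{B} \\ (n, P(y)) = 1}} 1 \\
&= y + \sum_{(n, P(y)) = 1} b(n) \\
&= N^{1/3} + S(B, \mathcal{P}, y).
\end{align*}
We now have a lower bound for $r(N)$ in terms of sieving functions.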
We now have a lower bound for r(NV) in terms of sieving functions. 


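Theorem 10.2
\[ r(N) > S(A, \mathcal{P}, z) - \frac{1}{2} \sum_{z \le q < y} S(A_q, \mathcal{P}, z) - \frac{1}{2} S(B, \mathcal{P}, y) - 2N^{7/8} - N^{1/3}. \]

We shall obtain a lower bound for $S(A, \mathcal{P}, z)$ and upper bounds for $\sum_q S(A_q, \mathcal{P}, z)$ and $S(B, \mathcal{P}, y)$.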


10.3. Prolegomena to sieving 


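In applying the linear sieve to estimate the three sieving functions, we choose the multiplicative function
\[ g(d) = g_n(d) = \frac{1}{\varphi(d)} \]
for all $n \ge 1$. Since $N$ is even, we have $2 \notin \mathcal{P}$ and
\[ 0 < g(p) = \frac{1}{p - 1} < 1 \quad \text{for all } p \in \mathcal{P}, \]
so the functions $g(d)$ satisfy (9.33). To establish inequality (9.34), we apply Theorem 6.9, which says that there exists a number $u_1(\varepsilon)$ such that
\[ \prod_{u \le p < z} \left( 1 - \frac{1}{p} \right)^{-1} < \left( 1 + \frac{\varepsilon}{3} \right) \frac{\log z}{\log u} \]
for any $u_1(\varepsilon) \le u < z$. Also, there exists $u_2(\varepsilon)$ such that
\[ \prod_{p \ge u_2(\varepsilon)} \frac{(p - 1)^2}{p(p - 2)} = \prod_{p \ge u_2(\varepsilon)} \left( 1 + \frac{1}{p(p - 2)} \right) < 1 + \frac{\varepsilon}{3} \]
since the infinite product converges. Therefore, for
\[ u \ge u_0(\varepsilon) = \max(u_1(\varepsilon), u_2(\varepsilon)) \]
we have
\begin{align*}
\prod_{\substack{p \in \mathcal{P} \\ u \le p < z}} (1 - g(p))^{-1} &\le \prod_{u \le p < z} \left( 1 - \frac{1}{p - 1} \right)^{-1} \\
&= \prod_{u \le p < z} \frac{(p - 1)^2}{p(p - 2)} \prod_{u \le p < z} \left( 1 - \frac{1}{p} \right)^{-1} \\
&< \left( 1 + \frac{\varepsilon}{3} \right)^2 \frac{\log z}{\log u} \\
&< (1 + \varepsilon)\, \frac{\log z}{\log u}.
\end{align*}
Let $\mathcal{Q}(\varepsilon)$ be the set of all primes $p < u_0(\varepsilon)$, and let $\mathcal{Q} = \mathcal{P} \cap \mathcal{Q}(\varepsilon)$. This gives (9.34). Let $Q(\varepsilon)$ be the product of the primes in $\mathcal{Q}(\varepsilon)$, and let $Q$ be the product of the primes in $\mathcal{Q}$. Then $Q(\varepsilon)$ depends only on $\varepsilon$, not on $N$, and so
\[ Q \le Q(\varepsilon) < \log N \tag{10.7} \]
for all sufficiently large integers $N$.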
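Theorem 10.3 Let $N$ be an even positive integer, and let
\[ V(z) = \prod_{p \mid P(z)} (1 - g(p)) = \prod_{\substack{p < z \\ (p, N) = 1}} \left( 1 - \frac{1}{p - 1} \right). \tag{10.8} \]
Then
\[ V(z) = \mathfrak{S}(N)\, \frac{e^{-\gamma}}{\log z} \left( 1 + O\!\left( \frac{1}{\log N} \right) \right), \]
where
\[ \mathfrak{S}(N) = \prod_{p > 2} \left( 1 - \frac{1}{(p - 1)^2} \right) \prod_{\substack{p \mid N \\ p > 2}} \frac{p - 1}{p - 2}. \]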
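Proof. Let
\[ W(z) = \prod_{2 < p < z} \left( 1 - \frac{1}{p - 1} \right). \]
Then
\begin{align*}
\frac{V(z)}{W(z)} &= \prod_{\substack{2 < p < z \\ p \mid N}} \left( 1 - \frac{1}{p - 1} \right)^{-1} \\
&= \prod_{\substack{p > 2 \\ p \mid N}} \left( 1 - \frac{1}{p - 1} \right)^{-1} \prod_{\substack{p \ge z \\ p \mid N}} \left( 1 - \frac{1}{p - 1} \right) \\
&= \prod_{\substack{p > 2 \\ p \mid N}} \frac{p - 1}{p - 2} \prod_{\substack{p \ge z \\ p \mid N}} \left( 1 - \frac{1}{p - 1} \right).
\end{align*}
Thus,
\[ \frac{V(z)}{W(z)} = \prod_{\substack{p > 2 \\ p \mid N}} \frac{p - 1}{p - 2} \left( 1 + O\!\left( \frac{\log N}{N^{1/8}} \right) \right). \]
To estimate $W(z)$, we see that
\begin{align*}
W(z) &= \prod_{2 < p < z} \left( 1 - \frac{1}{p} \right) \prod_{2 < p < z} \left( 1 - \frac{1}{p - 1} \right) \left( 1 - \frac{1}{p} \right)^{-1} \\
&= 2 \prod_{p < z} \left( 1 - \frac{1}{p} \right) \prod_{2 < p < z} \frac{p(p - 2)}{(p - 1)^2} \\
&= 2 \prod_{p < z} \left( 1 - \frac{1}{p} \right) \prod_{2 < p < z} \left( 1 - \frac{1}{(p - 1)^2} \right) \\
&= 2 \prod_{p < z} \left( 1 - \frac{1}{p} \right) \prod_{p > 2} \left( 1 - \frac{1}{(p - 1)^2} \right) \prod_{p \ge z} \left( 1 + \frac{1}{p(p - 2)} \right).
\end{align*}
Since $1 + x \le e^x \le 1 + 2x$ for $0 \le x \le \log 2$, it follows that
\[ 1 \le \prod_{p \ge z} \left( 1 + \frac{1}{p(p - 2)} \right) \le \exp\Biggl( \sum_{p \ge z} \frac{1}{p(p - 2)} \Biggr) = 1 + O\!\left( \frac{1}{z} \right). \]
By Mertens's formula (Theorem 6.8), we obtain
\begin{align*}
W(z) &= 2 \prod_{p > 2} \left( 1 - \frac{1}{(p - 1)^2} \right) \left( 1 + O\!\left( \frac{1}{z} \right) \right) \prod_{p < z} \left( 1 - \frac{1}{p} \right) \\
&= 2 \prod_{p > 2} \left( 1 - \frac{1}{(p - 1)^2} \right) \left( 1 + O\!\left( \frac{1}{z} \right) \right) \frac{e^{-\gamma}}{\log z} \left( 1 + O\!\left( \frac{1}{\log z} \right) \right) \\
&= 2 \prod_{p > 2} \left( 1 - \frac{1}{(p - 1)^2} \right) \frac{e^{-\gamma}}{\log z} \left( 1 + O\!\left( \frac{1}{\log N} \right) \right).
\end{align*}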

Therefore, 


ee 
10.4 A lower bound for $S(A, \mathcal{P}, z)$


Theorem 10.4
\[ S(A, \mathcal{P}, z) > \left( \frac{e^{\gamma} \log 3}{2} + O(\varepsilon) \right) \frac{N V(z)}{\log N}. \]


Proof. We shall apply the linear sieve and results about the distribution of prime 
numbers in arithmetic progressions to obtain a lower bound for the sieving function 
S(A, P, z). We use the prime number theorem in the form 


N ] 
x iy(+20(as)) 


lAl= Do 1 
p<N 


Then 


= (N) — w(N) 
= 7(N)+ O(log N) 


N 1 
os 1+0O . 
log N ( (<x) 


In the Jurkat—Richert theorem, the main term in the lower bound (9.36) is f(s)X, 


where 
X = V(z)|A| = V(z) N 1+0O 
TM os log N log N 


and V(z) is defined by (10.8). 
The remainder term in the Jurkat—Richert theorem is 


R=) |r@)l, 


d<QD 
d|P(z) 


where 


r(d) =|Aal — ) a(n)g(d) = |Aal — oa) (10.9) 


We want to obtain 


R —_— 
“ Gogny 

with D = D(N) as large as possible. We want D large because the function f(s) 

in the lower bound of the Jurkat—Richert theorem is an increasing function of 

s = log D/ log z for 2 < s < 4. We have 


|Aa| = )_ a(n) 


n=] 
d\n 
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oO 


y> 1 


N—-peA 
N-p=0 (mod d) 


yo 1 


peP 


p<N 
p=N_ (mod d) 


Y> 1+ O(@(N)) 
pun’ (mod d) 


uw(N;d, N)+ O(log N), 


where the term w(N) appears when we include the primes that divide N. Therefore, 


\begin{align*}
r(d) &= |A_d| - \frac{|A|}{\varphi(d)} \\
&= \pi(N; d, N) - \frac{\pi(N)}{\varphi(d)} + O(\log N) \\
&= \delta(N; d, N) + O(\log N),
\end{align*}
where
\[
\delta(x; d, a) = \pi(x; d, a) - \frac{\pi(x)}{\varphi(d)}
\]
for $x \ge 2$, $d \ge 1$, and $(d, a) = 1$. There are two important results that provide
estimates for $\delta(x; d, a)$. The Siegel–Walfisz theorem states that
\[
\delta(x; d, a) \ll \frac{x}{(\log x)^A}
\]
for any positive number $A$, where the implied constant depends only on $A$. This
result is useful if the modulus $d$ is not too large, say, $d \le (\log x)^A$. The
Bombieri–Vinogradov theorem tells us about the average distribution of primes in congruence
classes over a large set of moduli. It states that, for every $A > 0$, there exists a
positive number $B(A)$ such that
\[
\sum_{d < D(A)} \max_{(a, d) = 1} |\delta(x; d, a)| \ll \frac{x}{(\log x)^A}
\]
for
\[
D(A) = \frac{x^{1/2}}{(\log x)^{B(A)}},
\]


where the implied constant depends only on A. 
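The quantities in these two theorems can be computed directly for small $x$. The following Python sketch is only an illustration under stated assumptions: the helper names `primes_up_to`, `euler_phi`, `delta`, and `bv_sum` are hypothetical, and brute-force counting at this scale says nothing about the asymptotic statements themselves.

```python
from math import gcd, isqrt

def primes_up_to(x):
    """Sieve of Eratosthenes: all primes p <= x (x >= 1)."""
    sieve = [True] * (x + 1)
    sieve[0:2] = [False, False]
    for i in range(2, isqrt(x) + 1):
        if sieve[i]:
            for j in range(i * i, x + 1, i):
                sieve[j] = False
    return [p for p in range(2, x + 1) if sieve[p]]

def euler_phi(d):
    """Euler phi-function by direct counting."""
    return sum(1 for a in range(1, d + 1) if gcd(a, d) == 1)

def delta(x, d, a, primes=None):
    """delta(x; d, a) = pi(x; d, a) - pi(x)/phi(d), for (d, a) = 1."""
    if primes is None:
        primes = primes_up_to(x)
    pi_xda = sum(1 for p in primes if (p - a) % d == 0)
    return pi_xda - len(primes) / euler_phi(d)

def bv_sum(x, D):
    """Sum over d < D of the maximum over (a, d) = 1 of |delta(x; d, a)|."""
    primes = primes_up_to(x)
    total = 0.0
    for d in range(1, D):
        total += max(abs(delta(x, d, a, primes))
                     for a in range(1, d + 1) if gcd(a, d) == 1)
    return total
```

For example, `delta(100, 4, 1)` compares the 11 primes up to 100 that are congruent to 1 modulo 4 with the expected share $\pi(100)/\varphi(4) = 25/2$.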

We shall apply the Bombieri–Vinogradov theorem with $x = a = N$ and $A = 3$.
Let
\[
D = \frac{D(3)}{\log N} = \frac{N^{1/2}}{(\log N)^{B(3)+1}}.
\]
Then $D > z^2 = N^{1/4}$. Since $Q < Q(\varepsilon) < \log N$ for $N \ge N(\varepsilon)$, we have
\[
QD < D \log N = \frac{N^{1/2}}{(\log N)^{B(3)}} = D(3)
\]
and
\[
QD \log N < \frac{N^{1/2}}{(\log N)^{B(3)-1}} \ll \frac{N}{(\log N)^3}
\]
for $N$ sufficiently large. Therefore,


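\begin{align*}
R &= \sum_{\substack{d < QD \\ d \mid P(z)}} |r(d)| \\
&\le \sum_{\substack{d < QD \\ (d, N) = 1}} |r(d)| \\
&\ll \sum_{\substack{d < D(3) \\ (d, N) = 1}} |\delta(N; d, N)| + QD \log N \\
&\ll \sum_{\substack{d < D(3) \\ (d, N) = 1}} |\delta(N; d, N)| + \frac{N}{(\log N)^3} \\
&\ll \frac{N}{(\log N)^3}.
\end{align*}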


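Now we apply the Jurkat–Richert theorem (Theorem 9.7) with $z = N^{1/8}$ and $N$
sufficiently large. We have
\[
s = \frac{\log D}{\log z} = 4 - \frac{8(B(3) + 1) \log \log N}{\log N} \in [3, 4]
\]
and so
\[
f(s) = \frac{2 e^{\gamma} \log(s - 1)}{s}
= \frac{e^{\gamma} \log 3}{2} + O\left( \frac{\log \log N}{\log N} \right)
= \frac{e^{\gamma} \log 3}{2} + O(\varepsilon).
\]
Therefore,
\begin{align*}
S(A, P, z) &> \left( f(s) - \varepsilon e^{14 - s} \right) X - R \\
&> \left( f(s) - \varepsilon e^{11} \right) \frac{N V(z)}{\log N} \left( 1 + O\left( \frac{1}{\log N} \right) \right) + O\left( \frac{N}{(\log N)^3} \right) \\
&> \left( \frac{e^{\gamma} \log 3}{2} + O(\varepsilon) \right) \frac{N V(z)}{\log N}.
\end{align*}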
Theorem 10.5
\[
\sum_{z \le q < y} S(A_q, P, z) < \left( \frac{e^{\gamma} \log 6}{2} + O(\varepsilon) \right) \frac{N V(z)}{\log N}.
\]

Proof. We shall apply the Jurkat–Richert theorem again to get an upper bound
for $S(A_q, P, z)$, where $q$ is a prime number such that $z \le q < y$. If $n = N - p \in A$
and $q$ divides both $n$ and $N$, then $q = p$, which is impossible since the prime $p$
does not divide $N$. Therefore, $|A_q| = 0$ if $q$ divides $N$, so we can assume that
$(q, N) = 1$.

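Again we choose $g(d) = g_n(d) = 1/\varphi(d)$ for all $n$, so inequalities (9.33)
and (9.34) are satisfied. The error term $r_q(d)$ is defined by
\[
r_q(d) = |(A_q)_d| - \frac{|A_q|}{\varphi(d)}.
\]
Let $d$ divide $P(z)$. Since $d$ is a product of primes strictly less than $z$, it follows that
$(q, d) = 1$ for every prime number $q \ge z$, and so
\[
|(A_q)_d| = \sum_{\substack{q \mid n \\ d \mid n}} a(n) = \sum_{qd \mid n} a(n) = |A_{qd}|.
\]
Then
\begin{align*}
r_q(d) &= |A_{qd}| - \frac{|A_q|}{\varphi(d)} \\
&= |A_{qd}| - \frac{|A|}{\varphi(qd)} + \frac{|A|}{\varphi(qd)} - \frac{|A_q|}{\varphi(d)} \\
&= r(qd) - \frac{r(q)}{\varphi(d)},
\end{align*}
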

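where $r(qd)$ and $r(q)$ are error terms of the form (10.9). Let
\[
D = \frac{D(4)}{\log N} = \frac{N^{1/2}}{(\log N)^{B(4)+1}}
\]
and
\[
D_q = \frac{D}{q}.
\]
Then $D_q > D/y > z$. The remainder term for $S(A_q, P, z)$ is
\[
R_q = \sum_{\substack{d < QD_q \\ d \mid P(z)}} |r_q(d)|
\le \sum_{\substack{d < QD_q \\ d \mid P(z)}} |r(qd)| + |r(q)| \sum_{\substack{d < QD_q \\ d \mid P(z)}} \frac{1}{\varphi(d)}.
\]
From Theorem 9.7, we have the upper bound
\[
S(A_q, P, z) < \left( F(s_q) + \varepsilon e^{14} \right) |A_q| V(z) + R_q,
\]
where
\[
s_q = \frac{\log D_q}{\log z}.
\]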
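We do not estimate the main term and the remainder term for individual primes $q$.
Instead, summing over $z \le q < y$, we obtain
\[
\sum_{z \le q < y} S(A_q, P, z) < \sum_{\substack{z \le q < y \\ (q, N) = 1}} \left( F(s_q) + \varepsilon e^{14} \right) |A_q| V(z) + R',
\]
where
\begin{align*}
R' &= \sum_{\substack{z \le q < y \\ (q, N) = 1}} R_q \\
&\le \sum_{\substack{z \le q < y \\ (q, N) = 1}} \sum_{\substack{d < QD/q \\ d \mid P(z)}} |r(qd)|
+ \sum_{\substack{z \le q < y \\ (q, N) = 1}} |r(q)| \sum_{\substack{d < QD/q \\ d \mid P(z)}} \frac{1}{\varphi(d)} \\
&\le \sum_{\substack{d' < QD \\ (d', N) = 1}} |r(d')|
+ \sum_{\substack{z \le q < y \\ (q, N) = 1}} |r(q)| \sum_{d < N^{1/2}} \frac{1}{\varphi(d)}
\end{align*}
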

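and $QD < D(4)$. Applying the Bombieri–Vinogradov theorem as in the previous
section, we obtain
\[
\sum_{\substack{d' < QD \\ (d', N) = 1}} |r(d')|
\le \sum_{\substack{d' < QD \\ (d', N) = 1}} |\delta(N; d', N)| + \sum_{\substack{d' < QD \\ (d', N) = 1}} O(\log N)
\ll \frac{N}{(\log N)^4}.
\]
Since $y = N^{1/3} < D < QD$ for sufficiently large $N$, we also have
\[
\sum_{\substack{z \le q < y \\ (q, N) = 1}} |r(q)| \ll \frac{N}{(\log N)^4}.
\]
By Theorem A.17,
\[
\sum_{d < N^{1/2}} \frac{1}{\varphi(d)} \ll \log N
\]
and so
\[
R' \ll \frac{N}{(\log N)^3}.
\]
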


Next, we estimate the main term. We have
\[
s_q = \frac{\log (D/q)}{\log z} = \frac{8 \log(N^{1/2}/q)}{\log N} - \frac{8(B(4) + 1) \log \log N}{\log N}.
\]
Since $N^{1/8} = z \le q < y = N^{1/3}$, it follows that
\[
\frac{4}{3} < \frac{8 \log(N^{1/2}/q)}{\log N} \le 3, \tag{10.10}
\]
and so $1 \le s_q \le 3$. By Theorem 9.8, $F(s) = 2e^{\gamma}/s$ for $1 \le s \le 3$. Therefore,
\[
F(s_q) = \frac{2 e^{\gamma}}{s_q} = \frac{e^{\gamma} \log N}{4 \log(N^{1/2}/q)} + O\left( \frac{\log \log N}{\log N} \right)
\]
and so
\[
F(s_q) + \varepsilon e^{14} = \frac{e^{\gamma} \log N}{4 \log(N^{1/2}/q)} + O(\varepsilon). \tag{10.11}
\]


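Also,
\begin{align*}
|A_q| &= \pi(N; q, N) + O(\log N) \\
&= \frac{\pi(N)}{\varphi(q)} + \delta(N; q, N) + O(\log N) \\
&= \frac{N}{\varphi(q) \log N} \left( 1 + O\left( \frac{1}{\log N} \right) \right) + \delta(N; q, N) + O(\log N).
\end{align*}
Therefore, with all sums taken over the primes $q$ such that $z \le q < y$ and $(q, N) = 1$,
\begin{align*}
\sum_{q} \left( F(s_q) + \varepsilon e^{14} \right) |A_q|
&= \sum_{q} \left( \frac{e^{\gamma} \log N}{4 \log(N^{1/2}/q)} + O(\varepsilon) \right) \frac{N}{\varphi(q) \log N} \left( 1 + O\left( \frac{1}{\log N} \right) \right) \\
&\qquad + \sum_{q} \left( F(s_q) + \varepsilon e^{14} \right) \left( \delta(N; q, N) + O(\log N) \right) \\
&= \frac{e^{\gamma} N}{4} \sum_{q} \frac{1}{\varphi(q) \log(N^{1/2}/q)}
+ O\left( \frac{N}{\log N} \sum_{q} \frac{1}{\varphi(q) \log(N^{1/2}/q)} \right) \\
&\qquad + O\left( \frac{\varepsilon N}{\log N} \sum_{q} \frac{1}{\varphi(q)} \right)
+ O\left( \sum_{q} |\delta(N; q, N)| \right) + O\left( N^{1/3} \log N \right).
\end{align*}
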

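It is not difficult to evaluate these terms. By the Bombieri–Vinogradov theorem
again, we have
\[
\sum_{\substack{z \le q < y \\ (q, N) = 1}} |\delta(N; q, N)| = O\left( \frac{N}{(\log N)^3} \right).
\]
By Theorem 6.7, we have
\begin{align*}
\sum_{\substack{z \le q < y \\ (q, N) = 1}} \frac{1}{\varphi(q)}
&\le \sum_{z \le q < y} \frac{1}{q - 1} \\
&= \sum_{z \le q < y} \frac{1}{q} + \sum_{z \le q < y} \frac{1}{q(q - 1)} \\
&= \log \log y - \log \log z + O\left( \frac{1}{\log z} \right) \\
&= \log(8/3) + O\left( \frac{1}{\log z} \right) \\
&= O(1).
\end{align*}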
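Using this estimate and inequality (10.10), we have
\begin{align*}
\frac{N}{\log N} \sum_{\substack{z \le q < y \\ (q, N) = 1}} \frac{1}{\varphi(q) \log(N^{1/2}/q)}
&= \frac{N}{(\log N)^2} \sum_{\substack{z \le q < y \\ (q, N) = 1}} \frac{\log N}{\varphi(q) \log(N^{1/2}/q)} \\
&\le \frac{6N}{(\log N)^2} \sum_{\substack{z \le q < y \\ (q, N) = 1}} \frac{1}{\varphi(q)} \\
&\ll \frac{N}{(\log N)^2}.
\end{align*}
Therefore,
\[
\sum_{\substack{z \le q < y \\ (q, N) = 1}} \left( F(s_q) + \varepsilon e^{14} \right) |A_q|
= \frac{e^{\gamma} N}{4} \sum_{\substack{z \le q < y \\ (q, N) = 1}} \frac{1}{\varphi(q) \log(N^{1/2}/q)} + O\left( \frac{\varepsilon N}{\log N} \right).
\]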
(q.N)=1 (q,N)=t 


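We note that
\[
\frac{1}{\varphi(q)} = \frac{1}{q - 1} = \frac{1}{q} + \frac{1}{q(q - 1)}
\]
and
\begin{align*}
N \sum_{\substack{z \le q < y \\ (q, N) = 1}} \frac{1}{q(q - 1) \log(N^{1/2}/q)}
&\ll N \sum_{z \le q < y} \frac{1}{q^2 \log(N^{1/2}/y)} \\
&= \frac{6N}{\log N} \sum_{z \le q < y} \frac{1}{q^2} \\
&\ll \frac{N}{z \log N} \\
&= \frac{N^{7/8}}{\log N}.
\end{align*}
Let
\[
S(t) = \sum_{q < t} \frac{1}{q} = \log \log t + B + O\left( \frac{1}{\log t} \right)
\]
and
\[
f(t) = \frac{1}{\log(N^{1/2}/t)}.
\]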
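The functions $S(t)$ and $f(t)$ are increasing. We shall estimate the sum
\[
\sum_{z \le q < y} \frac{1}{q \log(N^{1/2}/q)}
\]
by using integration by parts twice in Riemann-Stieltjes integrals. We have
\begin{align*}
\sum_{z \le q < y} \frac{1}{q \log(N^{1/2}/q)}
&= \int_z^y \frac{dS(t)}{\log(N^{1/2}/t)} = \int_z^y f(t)\, dS(t) \\
&= f(y)S(y) - f(z)S(z) - \int_z^y S(t)\, df(t) \\
&= f(y)(\log \log y + B) - f(z)(\log \log z + B) - \int_z^y (\log \log t + B)\, df(t) \\
&\qquad + O\left( \frac{f(y)}{\log z} \right) + O\left( \int_z^y \frac{df(t)}{\log t} \right) \\
&= \int_z^y f(t)\, d \log \log t + O\left( \frac{1}{(\log N)^2} \right).
\end{align*}
We compute the integral explicitly by making the change of variable $t = N^{\alpha}$. Then
\begin{align*}
\int_z^y f(t)\, d \log \log t &= \int_z^y \frac{dt}{t \log t \log(N^{1/2}/t)} \\
&= \frac{1}{\log N} \int_{1/8}^{1/3} \frac{d\alpha}{\alpha\left( \frac{1}{2} - \alpha \right)} \\
&= \frac{2 \log 6}{\log N}.
\end{align*}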


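Therefore,
\[
\sum_{\substack{z \le q < y \\ (q, N) = 1}} \left( F(s_q) + \varepsilon e^{14} \right) |A_q|
= \left( \frac{e^{\gamma} \log 6}{2} + O(\varepsilon) \right) \frac{N}{\log N}
\]
and so
\[
\sum_{z \le q < y} S(A_q, P, z) < \left( \frac{e^{\gamma} \log 6}{2} + O(\varepsilon) \right) \frac{N V(z)}{\log N}.
\]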
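
The value $2 \log 6$ obtained from the change of variable $t = N^{\alpha}$ in the proof above can be checked numerically. The following Python sketch is only a sanity check, not part of the argument; the helper `simpson` is a hypothetical name for a standard composite Simpson rule.

```python
from math import log

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n subintervals (n made even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Integrand 1 / (alpha (1/2 - alpha)) on [1/8, 1/3].
integral = simpson(lambda al: 1.0 / (al * (0.5 - al)), 1 / 8, 1 / 3)
exact = 2 * log(6)
```

The closed form follows from the partial fraction decomposition $1/(\alpha(\frac12 - \alpha)) = 2/\alpha + 2/(\frac12 - \alpha)$, and `integral` should agree with `exact` to within the quadrature error.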
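10.6 An upper bound for S(B, P, y)


Theorem 10.6
\[
S(B, P, y) < \left( \frac{c e^{\gamma}}{2} + O(\varepsilon) \right) \frac{N V(z)}{\log N} + O\left( \frac{\varepsilon^{-1} N}{(\log N)^3} \right).
\]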




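Proof. Recall that
\[
B = \{ N - p_1 p_2 p_3 : z \le p_1 < y \le p_2 \le p_3,\ p_1 p_2 p_3 < N,\ (p_1 p_2 p_3, N) = 1 \}.
\]
Before estimating the sieving function $S(B, P, y)$, we shall drop the requirement
that $(p_1, N) = 1$ and relax the condition that $p_1 p_2 p_3 < N$ so that the numbers
$p_1$ and $p_2 p_3$ range over intervals independent of each other. This will produce
a “bilinear form” in $p_1$ and $p_2 p_3$. We shall let the prime $p_1$ vary over pairwise
disjoint intervals
\[
\ell \le p_1 < (1 + \varepsilon)\ell,
\]
where $\ell$ is a number of the form
\[
\ell = z(1 + \varepsilon)^k
\]
such that $z \le \ell < y$. Then
\[
0 \le k \le \frac{\log(y/z)}{\log(1 + \varepsilon)} \ll \frac{\log N}{\varepsilon}. \tag{10.12}
\]
Let
\[
B^{(\ell)} = \{ N - p_1 p_2 p_3 : z \le p_1 < y \le p_2 \le p_3,\ \ell \le p_1 < (1 + \varepsilon)\ell,\ \ell p_2 p_3 < N,\ (p_2 p_3, N) = 1 \} \tag{10.13}
\]
and
\[
\tilde{B} = \bigcup_{\ell} B^{(\ell)}.
\]
Then
\[
B \subseteq \tilde{B} \subseteq \{ N - p_1 p_2 p_3 : z \le p_1 < y \le p_2 \le p_3,\ p_1 p_2 p_3 < (1 + \varepsilon)N \}. \tag{10.14}
\]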


Let $b(n)$, $\tilde{b}(n)$, and $b^{(\ell)}(n)$ be the characteristic functions of the sets $B$, $\tilde{B}$, and
$B^{(\ell)}$, respectively. Since the sets $B^{(\ell)}$ are pairwise disjoint, we have
\[
|\tilde{B}| = \sum_{\ell} |B^{(\ell)}|
\]
and
\[
S(B, P, y) \le S(\tilde{B}, P, y) = \sum_{\ell} S(B^{(\ell)}, P, y).
\]
We shall estimate the sieving function $S(B^{(\ell)}, P, y)$ by using Theorem 9.7 with
the functions
\[
g(d) = g_n(d) = \frac{1}{\varphi(d)}
\]
for all $n \ge 1$, and with support level
\[
D = \frac{N^{1/2}}{(\log N)^A}.
\]
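Then
\[
|B_d^{(\ell)}| = \sum_{\substack{p_1 p_2 p_3 \equiv N \pmod{d} \\ z \le p_1 < y \le p_2 \le p_3,\ \ell \le p_1 < (1 + \varepsilon)\ell \\ \ell p_2 p_3 < N,\ (p_2 p_3, N) = 1}} 1,
\]
and the error term $r_d^{(\ell)}$ is defined by
\[
|B_d^{(\ell)}| = \frac{|B^{(\ell)}|}{\varphi(d)} + r_d^{(\ell)}.
\]
In the next section, we shall prove that
\[
R^{(\ell)} = \sum_{\substack{d < D \\ d \mid P(y)}} |r_d^{(\ell)}| \ll \frac{N}{(\log N)^4}. \tag{10.15}
\]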


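With this estimate for the remainder, Theorem 9.7 gives the upper bound
\[
S(B^{(\ell)}, P, y) < \left( F(s) + \varepsilon e^{14} \right) |B^{(\ell)}| V(y) + O\left( \frac{N}{(\log N)^4} \right),
\]
where
\[
s = \frac{\log D}{\log y} = \frac{3}{2} + O\left( \frac{\log \log N}{\log N} \right) \in [1, 3]
\]
and so, by Theorem 9.8,
\[
F(s) = \frac{2 e^{\gamma}}{s} = \frac{4 e^{\gamma}}{3} + O\left( \frac{\log \log N}{\log N} \right).
\]
It follows from (10.3) that
\[
\frac{V(y)}{V(z)} = \frac{\log z}{\log y} \left( 1 + O\left( \frac{1}{\log N} \right) \right) = \frac{3}{8} + O\left( \frac{1}{\log N} \right).
\]
This gives
\begin{align*}
S(B^{(\ell)}, P, y)
&< \left( \frac{4 e^{\gamma}}{3} + O(\varepsilon) \right) \left( \frac{3}{8} + O\left( \frac{1}{\log N} \right) \right) |B^{(\ell)}| V(z) + O\left( \frac{N}{(\log N)^4} \right) \\
&< \left( \frac{e^{\gamma}}{2} + O(\varepsilon) \right) |B^{(\ell)}| V(z) + O\left( \frac{N}{(\log N)^4} \right).
\end{align*}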


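Summing over the sets $B^{(\ell)}$, we obtain
\[
S(B, P, y) \le \sum_{\ell} S(B^{(\ell)}, P, y) < \left( \frac{e^{\gamma}}{2} + O(\varepsilon) \right) |\tilde{B}| V(z) + O\left( \frac{\varepsilon^{-1} N}{(\log N)^3} \right),
\]
since the number of sets $B^{(\ell)}$ is $O(\varepsilon^{-1} \log N)$ by (10.12).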
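Next, we estimate $|\tilde{B}|$. By the prime number theorem,
\[
\pi\left( \frac{(1 + \varepsilon)N}{p_1 p_2} \right) < \frac{(1 + 2\varepsilon)N}{p_1 p_2 \log(N/p_1 p_2)}
\]
for $N \ge N(\varepsilon)$. If $p_1 < p_2 \le p_3$ and $p_1 p_2 p_3 < (1 + \varepsilon)N$, then $p_1 p_2^2 < (1 + \varepsilon)N$
and
\[
p_3 < \frac{(1 + \varepsilon)N}{p_1 p_2}.
\]
It follows from (10.14) that
\begin{align*}
|\tilde{B}| &\le \sum_{\substack{z \le p_1 < y \le p_2 \le p_3 \\ p_1 p_2 p_3 < (1 + \varepsilon)N}} 1 \\
&\le \sum_{\substack{z \le p_1 < y \le p_2 \\ p_1 p_2^2 < (1 + \varepsilon)N}} \pi\left( \frac{(1 + \varepsilon)N}{p_1 p_2} \right) \\
&< (1 + 2\varepsilon)N \sum_{z \le p_1 < y} \frac{1}{p_1} \sum_{y \le p_2 < ((1 + \varepsilon)N/p_1)^{1/2}} \frac{1}{p_2 \log(N/p_1 p_2)}.
\end{align*}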

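To estimate the inner sum, we introduce the functions
\[
h(t) = \frac{1}{\log(N/p_1 t)}
\]
and
\[
H(u) = \int_y^{(N/u)^{1/2}} \frac{1}{\log(N/ut)}\, d \log \log t.
\]
The function $h(t)$ is positive and increasing for $0 < t < N/p_1$. Since $y = N^{1/3}$,
we have $(N/y)^{1/2} = y$ and so $H(y) = 0$. Since $z = N^{1/8}$, we have, with the change
of variable $t = N^{\alpha}$,
\begin{align*}
H(z) &= \int_{N^{1/3}}^{N^{7/16}} \frac{1}{\log(N^{7/8}/t)}\, d \log \log t \\
&= \frac{1}{\log N} \int_{1/3}^{7/16} \frac{d\alpha}{\alpha\left( \frac{7}{8} - \alpha \right)} \\
&= O\left( \frac{1}{\log N} \right).
\end{align*}
Recall that
\[
S(t) = \sum_{p < t} \frac{1}{p} = \log \log t + B + O\left( \frac{1}{\log t} \right).
\]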
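Applying integration by parts to the inner sum, we obtain
\begin{align*}
\sum_{y \le p_2 < ((1 + \varepsilon)N/p_1)^{1/2}} \frac{1}{p_2 \log(N/p_1 p_2)}
&= \sum_{y \le p_2 < ((1 + \varepsilon)N/p_1)^{1/2}} \frac{h(p_2)}{p_2} \\
&= \int_y^{((1 + \varepsilon)N/p_1)^{1/2}} h(t)\, dS(t) \\
&= \int_y^{((1 + \varepsilon)N/p_1)^{1/2}} h(t)\, d \log \log t + O\left( \frac{h\left( ((1 + \varepsilon)N/p_1)^{1/2} \right)}{\log y} \right) \\
&= \int_y^{(N/p_1)^{1/2}} \frac{d \log \log t}{\log(N/p_1 t)}
+ \int_{(N/p_1)^{1/2}}^{((1 + \varepsilon)N/p_1)^{1/2}} \frac{d \log \log t}{\log(N/p_1 t)} + O\left( \frac{1}{(\log N)^2} \right) \\
&= H(p_1) + O\left( \frac{1}{(\log N)^2} \right).
\end{align*}
The error term is obtained as follows. First,
\[
\frac{h\left( ((1 + \varepsilon)N/p_1)^{1/2} \right)}{\log y} = \frac{2}{\log y \, \log\left( N/((1 + \varepsilon)p_1) \right)} \ll \frac{1}{(\log N)^2}.
\]
Second, with the change of variable $t = (N/p_1)^{1/2} s$,
\begin{align*}
\int_{(N/p_1)^{1/2}}^{((1 + \varepsilon)N/p_1)^{1/2}} \frac{d \log \log t}{\log(N/p_1 t)}
&= \int_{(N/p_1)^{1/2}}^{((1 + \varepsilon)N/p_1)^{1/2}} \frac{dt}{t \log t \log(N/p_1 t)} \\
&= \int_1^{(1 + \varepsilon)^{1/2}} \frac{ds}{s \log\left( (N/p_1)^{1/2} s \right) \log\left( (N/p_1)^{1/2} s^{-1} \right)} \\
&= \int_1^{(1 + \varepsilon)^{1/2}} \frac{ds}{s \left( \log (N/p_1)^{1/2} + \log s \right) \left( \log (N/p_1)^{1/2} - \log s \right)} \\
&= \int_1^{(1 + \varepsilon)^{1/2}} \frac{ds}{s \left( \left( \log (N/p_1)^{1/2} \right)^2 - (\log s)^2 \right)} \\
&\ll \frac{1}{(\log N)^2} \int_1^{(1 + \varepsilon)^{1/2}} \frac{ds}{s} \\
&= O\left( \frac{1}{(\log N)^2} \right).
\end{align*}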

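It follows that the outer sum is
\[
\sum_{z \le p_1 < y} \frac{1}{p_1} \left( H(p_1) + O\left( \frac{1}{(\log N)^2} \right) \right)
= \sum_{z \le p_1 < y} \frac{H(p_1)}{p_1} + O\left( \frac{1}{(\log N)^2} \right),
\]
where the error term comes from the fact that
\begin{align*}
\sum_{z \le p_1 < y} \frac{1}{p_1} &= \log \log y - \log \log z + O\left( (\log z)^{-1} \right) \\
&= \log(8/3) + O\left( (\log N)^{-1} \right) \\
&= O(1).
\end{align*}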
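We calculate the main term, as usual, by integration by parts:
\begin{align*}
\sum_{z \le p_1 < y} \frac{H(p_1)}{p_1} &= \int_z^y H(u)\, dS(u) \\
&= \int_z^y H(u)\, d \log \log u + O\left( \frac{H(z)}{\log y} \right) \\
&= \int_z^y H(u)\, d \log \log u + O\left( \frac{1}{(\log N)^2} \right).
\end{align*}
To evaluate the integral, we make the change of variables $t = N^{\alpha}$ and $u = N^{\beta}$.
This gives
\begin{align*}
\int_z^y H(u)\, d \log \log u
&= \int_z^y \int_y^{(N/u)^{1/2}} \frac{1}{\log(N/ut)}\, d \log \log t \, d \log \log u \\
&= \frac{1}{\log N} \int_{1/8}^{1/3} \int_{1/3}^{(1 - \beta)/2} \frac{d\alpha \, d\beta}{\alpha \beta (1 - \alpha - \beta)} \\
&= \frac{1}{\log N} \int_{1/8}^{1/3} \frac{\log(2 - 3\beta)}{\beta(1 - \beta)}\, d\beta \\
&= \frac{c}{\log N},
\end{align*}
where
\[
c = \int_{1/8}^{1/3} \frac{\log(2 - 3\beta)}{\beta(1 - \beta)}\, d\beta = 0.363\ldots.
\]
Therefore,
\[
|\tilde{B}| < \frac{(1 + O(\varepsilon)) c N}{\log N} + O\left( \frac{N}{(\log N)^2} \right)
\]
and
\[
S(B, P, y) < \left( \frac{c e^{\gamma}}{2} + O(\varepsilon) \right) \frac{N V(z)}{\log N} + O\left( \frac{\varepsilon^{-1} N}{(\log N)^3} \right).
\]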


10.7 A bilinear form inequality 


We must still prove inequality (10.15) for the remainder $R^{(\ell)}$. This will be a
consequence of the following theorem.


Theorem 10.7 Let $a(n)$ be an arithmetic function such that $|a(n)| \le 1$ for all $n$.
Let $A$ be a positive number, let $X > (\log Y)^{2A}$, and let
\[
D^* = \frac{(XY)^{1/2}}{(\log Y)^A}.
\]
Then
\[
\sum_{d < D^*} \max_{(a, d) = 1} \left| \sum_{n < X} \sum_{\substack{Z \le p < Y \\ np \equiv a \pmod{d}}} a(n)
- \frac{1}{\varphi(d)} \sum_{n < X} \sum_{\substack{Z \le p < Y \\ (np, d) = 1}} a(n) \right|
\ll \frac{XY (\log XY)^2}{(\log Y)^A}, \tag{10.16}
\]
where the implied constant depends only on $A$.
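Proof. Let $(a, d) = 1$. By the orthogonality property of Dirichlet characters $\chi$
(mod $d$), we have
\[
\sum_{\chi \pmod{d}} \overline{\chi}(a) \chi(np) =
\begin{cases}
\varphi(d) & \text{if } np \equiv a \pmod{d}, \\
0 & \text{otherwise.}
\end{cases}
\]
This gives
\begin{align*}
\sum_{n < X} \sum_{\substack{Z \le p < Y \\ np \equiv a \pmod{d}}} a(n)
&= \sum_{n < X} \sum_{Z \le p < Y} a(n) \frac{1}{\varphi(d)} \sum_{\chi \pmod{d}} \overline{\chi}(a) \chi(np) \\
&= \frac{1}{\varphi(d)} \sum_{\chi \pmod{d}} \overline{\chi}(a) \sum_{n < X} a(n) \chi(n) \sum_{Z \le p < Y} \chi(p).
\end{align*}
The contribution of the principal character $\chi_0$ (mod $d$) to this sum is
\[
\frac{1}{\varphi(d)} \sum_{n < X} \sum_{\substack{Z \le p < Y \\ (np, d) = 1}} a(n).
\]
It follows that the left side of (10.16) is bounded above by
\[
\sum_{d < D^*} \frac{1}{\varphi(d)} \sum_{\substack{\chi \pmod{d} \\ \chi \ne \chi_0}}
\left| \sum_{n < X} a(n) \chi(n) \right| \left| \sum_{Z \le p < Y} \chi(p) \right|.
\]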

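Every character $\chi$ (mod $d$) factors uniquely into the product of a primitive character
(mod $r$) and the principal character (mod $s$), where $rs = d$. Therefore,
the sum can be written in the form
\begin{align*}
& \sum_{rs < D^*} \frac{1}{\varphi(rs)} \sideset{}{^*}\sum_{\substack{\chi \pmod{r} \\ \chi \ne \chi_0}}
\left| \sum_{\substack{n < X \\ (n, s) = 1}} a(n) \chi(n) \right| \left| \sum_{\substack{Z \le p < Y \\ (p, s) = 1}} \chi(p) \right| \tag{10.17} \\
& \qquad \le \sum_{s < D^*} \frac{1}{\varphi(s)} \sum_{r < D^*} \frac{1}{\varphi(r)} \sideset{}{^*}\sum_{\substack{\chi \pmod{r} \\ \chi \ne \chi_0}}
\left| \sum_{\substack{n < X \\ (n, s) = 1}} a(n) \chi(n) \right| \left| \sum_{\substack{Z \le p < Y \\ (p, s) = 1}} \chi(p) \right|,
\end{align*}
where $\sum^*$ denotes the sum over primitive characters (mod $r$). To obtain the last
inequality, we used the fact that the Euler $\varphi$-function satisfies $\varphi(rs) \ge \varphi(r)\varphi(s)$.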
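We can estimate the character sum $\sum_{p < Y} \chi(p)$ by means of the Siegel–Walfisz
theorem. We have
\begin{align*}
\sum_{p < Y} \chi(p) &= \sum_{a \pmod{r}} \chi(a) \sum_{\substack{p < Y \\ p \equiv a \pmod{r}}} 1 \\
&= \sum_{a \pmod{r}} \chi(a) \pi(Y; r, a) \\
&= \sum_{\substack{a \pmod{r} \\ (a, r) = 1}} \chi(a) \left( \frac{\pi(Y)}{\varphi(r)} + O\left( \frac{Y}{(\log Y)^B} \right) \right) \\
&\ll \frac{rY}{(\log Y)^B},
\end{align*}
since
\[
\sum_{a \pmod{r}} \chi(a) = 0
\]
for every nonprincipal character $\chi$. Since also
\[
\sum_{p < Z} \chi(p) \ll \frac{rZ}{(\log Z)^B} \ll \frac{rY}{(\log Y)^B},
\]
it follows by subtraction that
\[
\sum_{Z \le p < Y} \chi(p) \ll \frac{rY}{(\log Y)^B}.
\]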
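If we add the condition $(p, s) = 1$, we remove at most $\omega(s) \ll \log s \le \log D^*$
terms from the character sum and so
\[
\sum_{\substack{Z \le p < Y \\ (p, s) = 1}} \chi(p) \ll \frac{rY}{(\log Y)^B} + \log D^*.
\]
Since $|a(n)| \le 1$, we also have
\[
\left| \sum_{\substack{n < X \\ (n, s) = 1}} a(n) \chi(n) \right| \le X.
\]
Let $D_0^*$ be “small.” The inner sum in (10.17), restricted to $r \le D_0^*$, is
\begin{align*}
\sum_{r \le D_0^*} \frac{1}{\varphi(r)} \sideset{}{^*}\sum_{\substack{\chi \pmod{r} \\ \chi \ne \chi_0}}
\left| \sum_{\substack{n < X \\ (n, s) = 1}} a(n) \chi(n) \right| \left| \sum_{\substack{Z \le p < Y \\ (p, s) = 1}} \chi(p) \right|
&\ll \sum_{r \le D_0^*} \frac{1}{\varphi(r)} \sideset{}{^*}\sum_{\chi \pmod{r}} X \left( \frac{rY}{(\log Y)^B} + \log D^* \right) \\
&\ll \frac{D_0^{*2} XY \log D^*}{(\log Y)^B}.
\end{align*}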
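The rest of the inner sum in (10.17) ranges over $D_0^* < r < D^*$. We partition this
interval into pairwise disjoint subintervals of the form $D_1^* \le r < 2D_1^*$, where
$D_1^* = 2^k D_0^*$ and $0 \le k \ll \log D^*$. This produces partial sums of the form
\begin{align*}
& \sum_{\substack{D_1^* \le r < 2D_1^* \\ D_0^* < r < D^*}} \frac{1}{\varphi(r)} \sideset{}{^*}\sum_{\substack{\chi \pmod{r} \\ \chi \ne \chi_0}}
\left| \sum_{\substack{n < X \\ (n, s) = 1}} a(n) \chi(n) \right| \left| \sum_{\substack{Z \le p < Y \\ (p, s) = 1}} \chi(p) \right| \tag{10.18} \\
& \qquad \le \frac{1}{D_1^*} \sum_{\substack{D_1^* \le r < 2D_1^* \\ D_0^* < r < D^*}} \sideset{}{^*}\sum_{\substack{\chi \pmod{r} \\ \chi \ne \chi_0}}
\left( \left( \frac{r}{\varphi(r)} \right)^{1/2} \left| \sum_{\substack{n < X \\ (n, s) = 1}} a(n) \chi(n) \right| \right)
\left( \left( \frac{r}{\varphi(r)} \right)^{1/2} \left| \sum_{\substack{Z \le p < Y \\ (p, s) = 1}} \chi(p) \right| \right).
\end{align*}
By Cauchy’s inequality, this sum is bounded above by
\[
\frac{1}{D_1^*}
\left( \sum_{D_1^* \le r < 2D_1^*} \frac{r}{\varphi(r)} \sideset{}{^*}\sum_{\chi \pmod{r}} \left| \sum_{\substack{n < X \\ (n, s) = 1}} a(n) \chi(n) \right|^2 \right)^{1/2}
\left( \sum_{D_1^* \le r < 2D_1^*} \frac{r}{\varphi(r)} \sideset{}{^*}\sum_{\chi \pmod{r}} \left| \sum_{\substack{Z \le p < Y \\ (p, s) = 1}} \chi(p) \right|^2 \right)^{1/2}.
\]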
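The large sieve inequality [19, page 160] states that
\[
\sum_{r \le R} \frac{r}{\varphi(r)} \sideset{}{^*}\sum_{\chi \pmod{r}} \left| \sum_{n = L + 1}^{L + M} a(n) \chi(n) \right|^2
\ll (R^2 + M) \sum_{n = L + 1}^{L + M} |a(n)|^2
\]
for every arithmetic function $a(n)$. Applying this inequality to each of the factors
in the product, and using the condition that $|a(n)| \le 1$, we obtain
\begin{align*}
& \sum_{\substack{D_1^* \le r < 2D_1^* \\ D_0^* < r < D^*}} \frac{1}{\varphi(r)} \sideset{}{^*}\sum_{\substack{\chi \pmod{r} \\ \chi \ne \chi_0}}
\left| \sum_{\substack{n < X \\ (n, s) = 1}} a(n) \chi(n) \right| \left| \sum_{\substack{Z \le p < Y \\ (p, s) = 1}} \chi(p) \right| \\
& \qquad \ll \frac{1}{D_1^*} \left( (D_1^{*2} + X) X \right)^{1/2} \left( (D_1^{*2} + Y) Y \right)^{1/2} \\
& \qquad = \left( D_1^{*2} + X + Y + \frac{XY}{D_1^{*2}} \right)^{1/2} (XY)^{1/2} \\
& \qquad \ll \left( D_1^* + X^{1/2} + Y^{1/2} + \frac{(XY)^{1/2}}{D_1^*} \right) (XY)^{1/2} \\
& \qquad \ll \left( D^* + X^{1/2} + Y^{1/2} + \frac{(XY)^{1/2}}{D_0^*} \right) (XY)^{1/2}.
\end{align*}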


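Multiplying this by the number of partial sums, which is $O(\log D^*)$, and adding
(10.18), we obtain the following upper bound for the left side of (10.16):
\begin{align*}
& \sum_{d < D^*} \frac{1}{\varphi(d)} \sum_{\substack{\chi \pmod{d} \\ \chi \ne \chi_0}}
\left| \sum_{n < X} a(n) \chi(n) \right| \left| \sum_{Z \le p < Y} \chi(p) \right| \\
& \qquad \le \sum_{s < D^*} \frac{1}{\varphi(s)} \sum_{r < D^*} \frac{1}{\varphi(r)} \sideset{}{^*}\sum_{\substack{\chi \pmod{r} \\ \chi \ne \chi_0}}
\left| \sum_{\substack{n < X \\ (n, s) = 1}} a(n) \chi(n) \right| \left| \sum_{\substack{Z \le p < Y \\ (p, s) = 1}} \chi(p) \right| \\
& \qquad \ll \sum_{s < D^*} \frac{1}{\varphi(s)} \frac{D_0^{*2} XY \log D^*}{(\log Y)^B}
+ \sum_{s < D^*} \frac{1}{\varphi(s)} \left( D^* + X^{1/2} + Y^{1/2} + \frac{(XY)^{1/2}}{D_0^*} \right) (XY)^{1/2} \log D^* \\
& \qquad \ll \frac{D_0^{*2} XY (\log D^*)^2}{(\log Y)^B}
+ \left( D^* + X^{1/2} + Y^{1/2} + \frac{(XY)^{1/2}}{D_0^*} \right) (XY)^{1/2} (\log D^*)^2.
\end{align*}
Note that we picked up a factor $\log D^*$ from the estimate (Theorem A.17)
\[
\sum_{s < D^*} \frac{1}{\varphi(s)} \ll \log D^*.
\]
Choose $B = 4A$ and $D_0^* = (\log Y)^A$. Since $X > (\log Y)^{2A}$ and $Y \gg (\log Y)^{2A}$, it
follows that the left side of (10.16) is
\begin{align*}
& \ll \frac{XY (\log D^*)^2}{(\log Y)^A}
+ \left( \frac{D^*}{(XY)^{1/2}} + \frac{1}{X^{1/2}} + \frac{1}{Y^{1/2}} + \frac{1}{D_0^*} \right) XY (\log D^*)^2 \\
& \ll \frac{XY (\log XY)^2}{(\log Y)^A}.
\end{align*}
This completes the proof.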
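We can now derive the upper bound (10.15) for the remainder term
\[
R^{(\ell)} = \sum_{\substack{d < D \\ d \mid P(y)}} |r_d^{(\ell)}|,
\]
where $z \le \ell < y$. From the definition (10.13) of the sets $B^{(\ell)}$, we obtain the
individual error terms
\begin{align*}
r_d^{(\ell)} &= |B_d^{(\ell)}| - \frac{1}{\varphi(d)} |B^{(\ell)}| \\
&= \sum_{\substack{z \le p_1 < y \le p_2 \le p_3 \\ \ell \le p_1 < (1 + \varepsilon)\ell \\ \ell p_2 p_3 < N,\ (p_2 p_3, N) = 1 \\ p_1 p_2 p_3 \equiv N \pmod{d}}} 1
- \frac{1}{\varphi(d)} \sum_{\substack{z \le p_1 < y \le p_2 \le p_3 \\ \ell \le p_1 < (1 + \varepsilon)\ell \\ \ell p_2 p_3 < N,\ (p_2 p_3, N) = 1}} 1.
\end{align*}
We delete some numbers from the second sum by adding the condition that
$(p_1 p_2 p_3, d) = 1$. This is equivalent to $(p_1, d) = 1$, since the condition $(p_2 p_3, d) = 1$
already follows from the fact that $d$ divides $P(y)$. This additional condition
decreases the second term by at most
\[
\frac{1}{\varphi(d)} \sum_{\substack{p_1 p_2 p_3 < (1 + \varepsilon)N \\ p_1 \mid d,\ p_1 \ge z}} 1
\le \frac{1}{\varphi(d)} \sum_{\substack{p_1 \mid d \\ p_1 \ge z}} \frac{(1 + \varepsilon)N}{p_1}
\le \frac{(1 + \varepsilon) N \omega(d)}{z \varphi(d)}
\ll \frac{N \log d}{z \varphi(d)}.
\]
Let $a(n)$ be the characteristic function of the set of numbers of the form $n = p_2 p_3$,
where $y \le p_2 \le p_3$ and $(p_2 p_3, N) = 1$. Then we can write the error term in the
form
\[
r_d^{(\ell)} = \sum_{n < X} \sum_{\substack{Z \le p < Y \\ np \equiv a \pmod{d}}} a(n)
- \frac{1}{\varphi(d)} \sum_{n < X} \sum_{\substack{Z \le p < Y \\ (np, d) = 1}} a(n)
+ O\left( \frac{N \log d}{z \varphi(d)} \right),
\]
where
\begin{align*}
X &= N/\ell, \\
Y &= \min(y, (1 + \varepsilon)\ell), \\
Z &= \max(z, \ell), \\
a &= N.
\end{align*}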



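Since $\ell < y$, we have
\begin{align*}
D^* &= \frac{(XY)^{1/2}}{(\log Y)^A} \\
&= \frac{N^{1/2} \min\left( y/\ell,\ 1 + \varepsilon \right)^{1/2}}{(\log Y)^A} \\
&\ge \frac{N^{1/2}}{(\log N)^A} \\
&= D.
\end{align*}
Similarly,
\[
D^* \le (XY)^{1/2} \le (Ny)^{1/2} \le N.
\]
By Theorem 10.7,
\begin{align*}
R^{(\ell)} &= \sum_{\substack{d < D \\ d \mid P(y)}} |r_d^{(\ell)}| \\
&\le \sum_{\substack{d < D^* \\ d \mid P(y)}} |r_d^{(\ell)}| \\
&\ll \sum_{\substack{d < D^* \\ d \mid P(y)}} \left| \sum_{n < X} \sum_{\substack{Z \le p < Y \\ np \equiv a \pmod{d}}} a(n)
- \frac{1}{\varphi(d)} \sum_{n < X} \sum_{\substack{Z \le p < Y \\ (np, d) = 1}} a(n) \right|
+ \sum_{\substack{d < D^* \\ d \mid P(y)}} \frac{N \log d}{z \varphi(d)} \\
&\ll \frac{XY (\log XY)^2}{(\log Y)^A} + \frac{N \log D^*}{z} \sum_{d < D^*} \frac{1}{\varphi(d)} \\
&\ll \frac{N}{(\log N)^{A - 2}} + N^{7/8} (\log D^*)^2 \\
&\ll \frac{N}{(\log N)^{A - 2}} + N^{7/8} (\log N)^2 \\
&\ll \frac{N}{(\log N)^4}
\end{align*}
if we choose $A = 6$. This completes the proof.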


10.8 Conclusion 


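We can now prove Theorem 10.1.
Proof. It follows from the formula for $V(z)$ in Theorem 10.3 that
\[
\frac{N V(z)}{\log N} = 8 e^{-\gamma} \mathfrak{S}(N) \frac{N}{(\log N)^2} \left( 1 + O\left( \frac{1}{\log N} \right) \right).
\]
Theorem 10.2 gives a lower bound for $r(N)$ in terms of three sieving functions.
Using the estimates for these sieving functions in Theorems 10.4, 10.5, and 10.6,
we obtain
\begin{align*}
r(N) &> S(A, P, z) - \frac{1}{2} \sum_{z \le q < y} S(A_q, P, z) - \frac{1}{2} S(B, P, y) - 2N^{7/8} - N^{1/3} \\
&> \left( 2 \log 3 - \log 6 - c - O(\varepsilon) \right) \frac{e^{\gamma} N V(z)}{4 \log N}
+ O\left( \frac{\varepsilon^{-1} N}{(\log N)^3} \right) - 2N^{7/8} - N^{1/3} \\
&> \left( 2 \log 3 - \log 6 - c - O(\varepsilon) \right) \mathfrak{S}(N) \frac{2N}{(\log N)^2} \left( 1 + O\left( \frac{1}{\log N} \right) \right) \\
&\qquad + O\left( \frac{\varepsilon^{-1} N}{(\log N)^3} \right) - 2N^{7/8} - N^{1/3}.
\end{align*}
Since
\[
2 \log 3 - \log 6 - c = 0.042\ldots > 0,
\]
we can choose $\varepsilon$ such that $0 < \varepsilon < 1/200$ and
\[
2 \log 3 - \log 6 - c - O(\varepsilon) > 0.
\]
For this fixed value of $\varepsilon$, we have
\[
O\left( \frac{\varepsilon^{-1} N}{(\log N)^3} \right) = O\left( \frac{N}{(\log N)^3} \right).
\]
Then
\[
r(N) \gg \mathfrak{S}(N) \frac{2N}{(\log N)^2}.
\]
This completes the proof of Chen’s theorem.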
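
The positivity of $2 \log 3 - \log 6 - c$ is the numerical heart of the argument, and it can be checked by evaluating the integral that defines $c$. The following Python sketch is only an illustration; the names `simpson`, `integrand`, and `margin` are hypothetical, and a composite Simpson rule is assumed to be accurate enough here because the integrand is smooth on $[1/8, 1/3]$.

```python
from math import log

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n subintervals (n made even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def integrand(beta):
    """log(2 - 3 beta) / (beta (1 - beta)), the integrand defining c."""
    return log(2 - 3 * beta) / (beta * (1 - beta))

c = simpson(integrand, 1 / 8, 1 / 3)
margin = 2 * log(3) - log(6) - c   # should be small but positive
```

The margin is small compared with the individual terms, which is why the constants in Theorems 10.4, 10.5, and 10.6 have to be tracked so carefully.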


10.9 Notes 


Chen [10, 11] announced his theorem in 1966 but did not publish the proof until 
1973, apparently because of difficulties arising from the Cultural Revolution in 
China. An account of Chen’s original proof appears in Halberstam and Richert’s 
Sieve Methods [44]. The proof in this chapter is based on unpublished notes and
lectures of Henryk Iwaniec [67]. The argument uses standard results from
multiplicative number theory (Dirichlet characters, the large sieve, and the Siegel–Walfisz
and Bombieri–Vinogradov theorems), all of which can be found in Davenport [19].
Other good references for these results are the monographs of Montgomery [83] 
and Bombieri [3]. For bilinear form inequalities, see Bombieri, Friedlander, and 
Iwaniec [4]. 


Part III 


Appendix 


Arithmetic functions 


A.1 The ring of arithmetic functions 


An arithmetic function is a complex-valued function whose domain is the set of 
all positive integers. Let f and g be arithmetic functions. The sum f + g is the 
arithmetic function defined by 


\[
(f + g)(n) = f(n) + g(n).
\]
Addition of arithmetic functions is clearly associative and commutative, and every
arithmetic function $f$ has an inverse $-f$ defined by $(-f)(n) = -f(n)$.
The Dirichlet convolution of the arithmetic functions $f$ and $g$ is defined by
\[
(f * g)(n) = \sum_{d \mid n} f(d) g(n/d).
\]
It is easy to see that Dirichlet convolution is commutative, that is, $f * g = g * f$,
and distributes over addition in the following way:
\[
f * (g + h) = f * g + f * h.
\]
The following theorem shows that Dirichlet convolution is also associative.


Theorem A.1 If $f$, $g$, and $h$ are arithmetic functions, then
\[
f * (g * h) = (f * g) * h.
\]
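Proof. For any $n \ge 1$,
\begin{align*}
((f * g) * h)(n) &= \sum_{d \mid n} (f * g)(d) h\left( \frac{n}{d} \right) \\
&= \sum_{dm = n} (f * g)(d) h(m) \\
&= \sum_{dm = n} \sum_{k\ell = d} f(k) g(\ell) h(m) \\
&= \sum_{k \ell m = n} f(k) g(\ell) h(m) \\
&= \sum_{k \mid n} f(k) \sum_{\ell m = n/k} g(\ell) h(m) \\
&= \sum_{k \mid n} f(k) \sum_{\ell \mid (n/k)} g(\ell) h\left( \frac{n}{k\ell} \right) \\
&= \sum_{k \mid n} f(k) (g * h)\left( \frac{n}{k} \right) \\
&= (f * (g * h))(n).
\end{align*}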


This completes the proof. 
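
Commutativity and associativity of Dirichlet convolution can be spot-checked on small integers. The following Python sketch is only an illustration under stated assumptions: `divisors` and `convolve` are hypothetical helper names, and the test functions `f`, `g`, `h` are arbitrary choices with integer values, so that the comparison is exact.

```python
def divisors(n):
    """All positive divisors of n, by trial division."""
    return [d for d in range(1, n + 1) if n % d == 0]

def convolve(f, g):
    """Dirichlet convolution: (f * g)(n) = sum over d | n of f(d) g(n/d)."""
    return lambda n: sum(f(d) * g(n // d) for d in divisors(n))

f = lambda n: n            # the function n -> n
g = lambda n: 1            # the constant function 1
h = lambda n: n * n + 1    # an arbitrary function, not multiplicative

left = convolve(convolve(f, g), h)     # (f * g) * h
right = convolve(f, convolve(g, h))    # f * (g * h)
```

Here `convolve(f, g)` is the sum-of-divisors function, since it adds up $d \cdot 1$ over the divisors $d$ of $n$.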
We define the arithmetic function $\delta(n)$ by
\[
\delta(n) =
\begin{cases}
1 & \text{if } n = 1, \\
0 & \text{if } n \ge 2.
\end{cases}
\]
Then for any arithmetic function $f$ we have
\[
(f * \delta)(n) = \sum_{d \mid n} f(d) \delta\left( \frac{n}{d} \right) = f(n),
\]
and so the set of complex-valued arithmetic functions forms a commutative ring
with identity element $\delta(n)$. This ring is an integral domain (Exercise 3).
The product $f \cdot g$ of the arithmetic functions $f$ and $g$ is defined by
\[
(f \cdot g)(n) = f(n) g(n).
\]
Let $L$ be the arithmetic function $L(n) = \log n$. Multiplication by $L$ is a derivation
on the ring of arithmetic functions, that is,
\[
L \cdot (f * g) = (L \cdot f) * g + f * (L \cdot g)
\]


(Exercise 11). 
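
The identity element and the derivation property can both be tested numerically. The following Python sketch is only an illustration; `convolve`, `delta`, and `times_L` are hypothetical helper names, the functions `f` and `g` are arbitrary choices, and the derivation identity is compared in floating point, so agreement is checked only up to rounding.

```python
from math import isclose, log

def convolve(f, g):
    """Dirichlet convolution: (f * g)(n) = sum over d | n of f(d) g(n/d)."""
    return lambda n: sum(f(d) * g(n // d)
                         for d in range(1, n + 1) if n % d == 0)

def delta(n):
    """Identity for convolution: 1 at n = 1 and 0 for n >= 2."""
    return 1 if n == 1 else 0

def times_L(f):
    """Pointwise product L . f, where L(n) = log n."""
    return lambda n: log(n) * f(n)

f = lambda n: n + 2        # arbitrary test functions
g = lambda n: 3 * n - 1

lhs = times_L(convolve(f, g))            # L . (f * g)
rhs_a = convolve(times_L(f), g)          # (L . f) * g
rhs_b = convolve(f, times_L(g))          # f * (L . g)
```

The identity holds because $\log n = \log d + \log(n/d)$ for every divisor $d$ of $n$, which splits each term of the convolution into two.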
A.2 Sums and integrals


In number theory, we often need to establish asymptotic formulas or at least good
estimates for sums of the form
\[
M_f(x) = \sum_{n \le x} f(n),
\]
where $f(n)$ is an arithmetic function. It is sometimes possible to estimate these
“mean values” by integrals.


Theorem A.2 Let $a$ and $b$ be integers with $a < b$, and let $f(t)$ be a monotonic
function on the interval $[a, b]$. Then
\[
\min(f(a), f(b)) \le \sum_{k = a}^{b} f(k) - \int_a^b f(t)\, dt \le \max(f(a), f(b)).
\]
Proof. If $f(t)$ is increasing on $[a, b]$, then
\[
f(k) \le \int_k^{k + 1} f(t)\, dt
\]
for $k = a, a + 1, \ldots, b - 1$, and
\[
f(k) \ge \int_{k - 1}^{k} f(t)\, dt
\]
for $k = a + 1, \ldots, b$. It follows that
\[
\sum_{k = a}^{b} f(k) = \sum_{k = a}^{b - 1} f(k) + f(b) \le \int_a^b f(t)\, dt + f(b)
\]
and
\[
\sum_{k = a}^{b} f(k) = \sum_{k = a + 1}^{b} f(k) + f(a) \ge \int_a^b f(t)\, dt + f(a).
\]
Thus,
\[
f(a) \le \sum_{k = a}^{b} f(k) - \int_a^b f(t)\, dt \le f(b).
\]
Similarly, if $f(t)$ is decreasing, then
\[
f(b) \le \sum_{k = a}^{b} f(k) - \int_a^b f(t)\, dt \le f(a).
\]

This completes the proof. 
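
Theorem A.2 is easy to test on functions whose antiderivative is known in closed form. The following Python sketch is only an illustration; `sum_minus_integral` is a hypothetical helper that takes both $f$ and an antiderivative $F$ supplied by the caller.

```python
from math import log

def sum_minus_integral(f, F, a, b):
    """sum_{k=a}^{b} f(k) - (F(b) - F(a)), where F is an antiderivative of f."""
    return sum(f(k) for k in range(a, b + 1)) - (F(b) - F(a))

# Decreasing example: f(t) = 1/t on [1, 1000], with antiderivative log t.
gap_dec = sum_minus_integral(lambda t: 1.0 / t, log, 1, 1000)

# Increasing example: f(t) = t^2 on [0, 50], with antiderivative t^3 / 3.
gap_inc = sum_minus_integral(lambda t: t * t, lambda t: t ** 3 / 3, 0, 50)
```

In the first example the theorem predicts $1/1000 \le$ `gap_dec` $\le 1$, and in the second it predicts $0 \le$ `gap_inc` $\le 2500$.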
From this result, we get a useful estimate for n!. 
Theorem A.3 For any positive integer $n$, we have
\[
e \left( \frac{n}{e} \right)^n \le n! \le e n \left( \frac{n}{e} \right)^n.
\]

Proof. Since the function $f(t) = \log t$ is increasing on the interval $[1, n]$, it
follows from Theorem A.2 that
\[
\log n! = \sum_{k = 1}^{n} \log k \le \int_1^n \log t \, dt + \log n = n \log n - n + 1 + \log n
\]
and
\[
\log n! \ge \int_1^n \log t \, dt = n \log n - n + 1.
\]



The result follows from exponentiating these two inequalities. 
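
The two logarithmic inequalities in the proof can be checked directly. The following Python sketch is only an illustration; `factorial_bounds_hold` is a hypothetical helper, `math.lgamma(n + 1)` is used for $\log n!$, and a small tolerance is assumed to absorb rounding, since the bounds are attained with equality at $n = 1$.

```python
from math import lgamma, log

def factorial_bounds_hold(n, tol=1e-9):
    """Check n log n - n + 1 <= log n! <= n log n - n + 1 + log n."""
    lower = n * log(n) - n + 1
    upper = lower + log(n)
    log_factorial = lgamma(n + 1)   # log Gamma(n + 1) = log n!
    return lower - tol <= log_factorial <= upper + tol
```

For instance, `factorial_bounds_hold(10)` compares $\log 10!$ with $10 \log 10 - 9$ and $10 \log 10 - 9 + \log 10$.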
Partial summation is another simple and powerful tool for computing sums in 
analysis and number theory. 


Theorem A.4 (Partial summation) Let $u(n)$ and $f(n)$ be arithmetic functions.
Define the sum function
\[
U(t) = \sum_{n \le t} u(n).
\]
Let $a$ and $b$ be nonnegative integers with $a < b$. Then
\[
\sum_{n = a + 1}^{b} u(n) f(n) = U(b) f(b) - U(a) f(a + 1) - \sum_{n = a + 1}^{b - 1} U(n) \left( f(n + 1) - f(n) \right).
\]
Let $x$ and $y$ be real numbers such that $0 \le y < x$. If $f(t)$ is a function with a
continuous derivative on the interval $[y, x]$, then
\[
\sum_{y < n \le x} u(n) f(n) = U(x) f(x) - U(y) f(y) - \int_y^x U(t) f'(t)\, dt.
\]
In particular, if $f(t)$ has a continuous derivative on $[1, x]$, then
\[
\sum_{n \le x} u(n) f(n) = U(x) f(x) - \int_1^x U(t) f'(t)\, dt.
\]

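Proof. This is a straightforward calculation.
\begin{align*}
\sum_{n = a + 1}^{b} u(n) f(n)
&= \sum_{n = a + 1}^{b} \left( U(n) - U(n - 1) \right) f(n) \\
&= \sum_{n = a + 1}^{b} U(n) f(n) - \sum_{n = a}^{b - 1} U(n) f(n + 1) \\
&= U(b) f(b) - U(a) f(a + 1) - \sum_{n = a + 1}^{b - 1} U(n) \left( f(n + 1) - f(n) \right).
\end{align*}
If the function $f(t)$ is continuously differentiable on $[y, x]$, then
\[
f(n + 1) - f(n) = \int_n^{n + 1} f'(t)\, dt
\]
and
\[
U(n) \left( f(n + 1) - f(n) \right) = \int_n^{n + 1} U(t) f'(t)\, dt.
\]
Let $a = [y]$ and $b = [x]$. Then
\begin{align*}
\sum_{y < n \le x} u(n) f(n)
&= \sum_{n = a + 1}^{b} u(n) f(n) \\
&= U(b) f(b) - U(a) f(a + 1) - \sum_{n = a + 1}^{b - 1} U(n) \left( f(n + 1) - f(n) \right) \\
&= U(x) f(b) - U(y) f(a + 1) - \sum_{n = a + 1}^{b - 1} \int_n^{n + 1} U(t) f'(t)\, dt \\
&= U(x) f(x) - U(y) f(y) - U(x) \left( f(x) - f(b) \right) - U(y) \left( f(a + 1) - f(y) \right) \\
&\qquad - \int_{a + 1}^{b} U(t) f'(t)\, dt \\
&= U(x) f(x) - U(y) f(y) - \int_y^x U(t) f'(t)\, dt.
\end{align*}
If $f(t)$ is continuously differentiable on $[1, x]$, then
\begin{align*}
\sum_{n \le x} u(n) f(n) &= u(1) f(1) + \sum_{1 < n \le x} u(n) f(n) \\
&= u(1) f(1) + U(x) f(x) - U(1) f(1) - \int_1^x U(t) f'(t)\, dt \\
&= U(x) f(x) - \int_1^x U(t) f'(t)\, dt.
\end{align*}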


This completes the proof. 
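
The discrete form of partial summation is an exact identity, so it can be verified in rational arithmetic. The following Python sketch is only an illustration; `partial_summation_sides` is a hypothetical helper, and the functions `u` and `f` are arbitrary choices.

```python
from fractions import Fraction

def partial_summation_sides(u, f, a, b):
    """Return both sides of the discrete identity in Theorem A.4."""
    def U(t):
        # U(t) = sum of u(n) over 1 <= n <= t; the empty sum U(0) is 0.
        return sum(u(n) for n in range(1, t + 1))
    lhs = sum(u(n) * f(n) for n in range(a + 1, b + 1))
    rhs = (U(b) * f(b) - U(a) * f(a + 1)
           - sum(U(n) * (f(n + 1) - f(n)) for n in range(a + 1, b)))
    return lhs, rhs

u = lambda n: (-1) ** n * n                 # arbitrary arithmetic function
f = lambda n: Fraction(1, n * n + 1)        # arbitrary rational-valued function
lhs, rhs = partial_summation_sides(u, f, 3, 40)
```

Because `Fraction` arithmetic is exact, the two sides agree exactly rather than only up to rounding.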


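Here is an application of partial summation. Recall that every real number $x$ can
be written in the form
\[
x = [x] + \{x\},
\]
where $[x]$ is the integer part of $x$ and $\{x\}$ is the fractional part of $x$.


Theorem A.5 Let
\[
\gamma = 1 - \int_1^{\infty} \frac{\{t\}}{t^2}\, dt.
\]
Then $0 < \gamma < 1$ and
\[
\sum_{n \le x} \frac{1}{n} = \log x + \gamma + O\left( \frac{1}{x} \right).
\]
The real number $\gamma$ is called Euler’s constant.


Proof. Since $0 \le \{t\} < 1$ for all $t$, we have
\[
0 < \int_1^{\infty} \frac{\{t\}}{t^2}\, dt < \int_1^{\infty} \frac{dt}{t^2} = 1,
\]
and so Euler’s constant $\gamma$ is a well-defined real number in the interval $(0, 1)$.
We apply partial summation with $u(n) = 1$ for all $n$ and $f(t) = 1/t$. Then
\[
U(t) = [t] = t - \{t\}
\]
and
\begin{align*}
\sum_{n \le x} \frac{1}{n} &= \sum_{n \le x} u(n) f(n) \\
&= \frac{[x]}{x} + \int_1^x \frac{[t]}{t^2}\, dt \\
&= 1 - \frac{\{x\}}{x} + \int_1^x \frac{dt}{t} - \int_1^x \frac{\{t\}}{t^2}\, dt \\
&= \log x + 1 - \int_1^{\infty} \frac{\{t\}}{t^2}\, dt + \int_x^{\infty} \frac{\{t\}}{t^2}\, dt - \frac{\{x\}}{x} \\
&= \log x + \gamma + O\left( \frac{1}{x} \right).
\end{align*}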

This completes the proof. 
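
The error term $O(1/x)$ can be observed numerically at integer values of $x$. The following Python sketch is only an illustration; `EULER_GAMMA` is the standard decimal approximation of Euler's constant, hard-coded here as an assumption, and `harmonic_gap` is a hypothetical helper.

```python
from math import log

EULER_GAMMA = 0.5772156649015329   # decimal approximation of Euler's constant

def harmonic_gap(x):
    """sum_{n <= x} 1/n - log x - gamma, for an integer x >= 1."""
    return sum(1.0 / n for n in range(1, x + 1)) - log(x) - EULER_GAMMA
```

At integer $x$ the gap is positive and roughly $1/(2x)$, which is consistent with the $O(1/x)$ term in the theorem.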
As another application of partial summation, we obtain the Euler sum formula. 


Theorem A.6 (Euler sum formula) Let $f(t)$ be a function with a continuous
derivative on $[y, x]$. Then
\[
\sum_{y < n \le x} f(n) = \int_y^x f(t)\, dt + R,
\]
where
\[
R = \int_y^x \{t\} f'(t)\, dt + \{y\} f(y) - \{x\} f(x)
= \int_y^x \theta(t) f'(t)\, dt + \theta(y) f(y) - \theta(x) f(x)
\]
and
\[
\theta(t) = \{t\} - \frac{1}{2}.
\]
Proof. We apply partial summation with $u(n) = 1$ for all $n$. Then $U(t) = [t] =
t - \{t\}$ and
\begin{align*}
\sum_{y < n \le x} f(n)
&= [x] f(x) - [y] f(y) - \int_y^x [t] f'(t)\, dt \\
&= [x] f(x) - [y] f(y) - \int_y^x t f'(t)\, dt + \int_y^x \{t\} f'(t)\, dt \\
&= [x] f(x) - [y] f(y) - \left( x f(x) - y f(y) \right) + \int_y^x f(t)\, dt + \int_y^x \{t\} f'(t)\, dt \\
&= \int_y^x f(t)\, dt + \int_y^x \{t\} f'(t)\, dt + \{y\} f(y) - \{x\} f(x).
\end{align*}

This completes the proof. 
There is a simple expression for partial summation in terms of Riemann-Stieltjes
integrals. If $f$ and $g$ are bounded functions on $[y, x]$ and if $\int_y^x f\, dg$ exists, then
$\int_y^x g\, df$ also exists and
\[
\int_y^x f\, dg + \int_y^x g\, df = f(x) g(x) - f(y) g(y).
\]
This lovely reciprocity law is called integration by parts. (See Apostol [1, chapter
9].) Let $u(n)$ be a nonnegative arithmetic function, and let
\[
U(t) = \sum_{n \le t} u(n).
\]
If $f$ is continuous on $[y, x]$, then
\[
\sum_{y < n \le x} u(n) f(n) = \int_y^x f(t)\, dU(t) = U(x) f(x) - U(y) f(y) - \int_y^x U(t)\, df(t).
\]
If $f$ has a continuous derivative on $[y, x]$, then
\[
\int_y^x U(t)\, df(t) = \int_y^x U(t) f'(t)\, dt,
\]
and we recover the formula for partial summation. Similarly, if we let
\[
U(t) = \sum_{n < t} u(n)
\]
and if $f$ is continuous on $[y, x]$, then
\[
\sum_{y \le n < x} u(n) f(n) = U(x) f(x) - U(y) f(y) - \int_y^x U(t)\, df(t).
\]
A.3 Multiplicative functions


An arithmetic function $f(n)$ is multiplicative if
\[
f(mn) = f(m) f(n)
\]
whenever $m$ and $n$ are relatively prime positive integers. Since $f(1) = f(1 \cdot 1) =
f(1)^2$, we have $f(1) = 1$ or $0$. If $f(1) = 0$, then $f(n) = f(n \cdot 1) = f(n) f(1) = 0$
for all $n \ge 1$. Therefore, if the multiplicative function $f$ is not identically zero,
then $f(1) = 1$.

If $f$ and $g$ are multiplicative functions, then the Dirichlet convolution $f * g$ is
multiplicative (Exercise 2). An arithmetic function $f(n)$ is completely multiplicative
if $f(mn) = f(m) f(n)$ for all positive integers $m$ and $n$.


Theorem A.7 Let $f$ be a multiplicative function. Then
\[
f([m, n]) f((m, n)) = f(m) f(n).
\]
Proof. Let $p_1, \ldots, p_r$ be the prime numbers that divide $m$ or $n$. Then
\[
m = \prod_{i = 1}^{r} p_i^{r_i}
\]
and
\[
n = \prod_{i = 1}^{r} p_i^{s_i},
\]
where $r_1, \ldots, r_r, s_1, \ldots, s_r$ are nonnegative integers. Moreover,
\[
[m, n] = \prod_{i = 1}^{r} p_i^{\max(r_i, s_i)}
\]
and
\[
(m, n) = \prod_{i = 1}^{r} p_i^{\min(r_i, s_i)}.
\]
Since
\[
\{ \max(r_i, s_i), \min(r_i, s_i) \} = \{ r_i, s_i \}
\]
and since $f$ is multiplicative, it follows that
\begin{align*}
f([m, n]) f((m, n)) &= \prod_{i = 1}^{r} f\left( p_i^{\max(r_i, s_i)} \right) \prod_{i = 1}^{r} f\left( p_i^{\min(r_i, s_i)} \right) \\
&= \prod_{i = 1}^{r} f\left( p_i^{r_i} \right) \prod_{i = 1}^{r} f\left( p_i^{s_i} \right) \\
&= f(m) f(n).
\end{align*}


This completes the proof. 
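
Theorem A.7 can be spot-checked with any multiplicative function. The following Python sketch is only an illustration; it uses the divisor-counting function, which is shown to be multiplicative in Theorem A.10, and the helper names `num_divisors`, `lcm`, and `identity_holds` are hypothetical.

```python
from math import gcd

def num_divisors(n):
    """Number of positive divisors of n, by direct counting."""
    return sum(1 for k in range(1, n + 1) if n % k == 0)

def lcm(m, n):
    """Least common multiple [m, n]."""
    return m * n // gcd(m, n)

def identity_holds(f, m, n):
    """Check f([m, n]) f((m, n)) == f(m) f(n)."""
    return f(lcm(m, n)) * f(gcd(m, n)) == f(m) * f(n)
```

The hypothesis matters: for the function $n \mapsto n + 1$, which is not multiplicative, the identity already fails at $m = 2$, $n = 3$.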
The Möbius function $\mu(n)$ is defined by
\[
\mu(n) =
\begin{cases}
1 & \text{if } n = 1, \\
0 & \text{if } n \text{ is divisible by the square of a prime,} \\
(-1)^r & \text{if } n \text{ is the product of } r \text{ distinct primes.}
\end{cases}
\]
Thus, $\mu(n) \ne 0$ if and only if $n$ is square-free, and
\[
\mu(n) = (-1)^{\omega(n)}
\]
for square-free integers $n$, where $\omega(n)$ is the number of distinct prime divisors of
$n$. It is easy to check that the arithmetic function $\mu(n)$ is multiplicative.


Theorem A.8 Let $f$ be a multiplicative function with $f(1) = 1$. Then
\[
\sum_{d \mid n} \mu(d) f(d) = \prod_{p \mid n} (1 - f(p)).
\]

Proof. This is certainly true for $n = 1$. For $n > 1$, let $n^*$ be the product of the
distinct primes dividing $n$. Since $\mu(d) = 0$ if $d$ is not square-free, it follows that
\[
\sum_{d \mid n} \mu(d) f(d) = \sum_{d \mid n^*} \mu(d) f(d) = \prod_{p \mid n} (1 - f(p)).
\]
This completes the proof. 
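
Theorem A.8 can be verified exactly for a concrete multiplicative function. The following Python sketch is only an illustration; the helper names are hypothetical, and $f(n) = 1/n$ is chosen as the test function, in which case both sides equal $\prod_{p \mid n} (1 - 1/p)$.

```python
from fractions import Fraction

def prime_factors(n):
    """List of (prime, exponent) pairs in the factorization of n."""
    out, p = [], 2
    while p * p <= n:
        if n % p == 0:
            k = 0
            while n % p == 0:
                n //= p
                k += 1
            out.append((p, k))
        p += 1
    if n > 1:
        out.append((n, 1))
    return out

def mobius(n):
    """Moebius function: 0 if a square divides n, else (-1)^(number of primes)."""
    factors = prime_factors(n)
    if any(k > 1 for _, k in factors):
        return 0
    return (-1) ** len(factors)

def divisor_sum(f, n):
    """Left side: sum over d | n of mu(d) f(d)."""
    return sum(mobius(d) * f(d) for d in range(1, n + 1) if n % d == 0)

def prime_product(f, n):
    """Right side: product over primes p | n of (1 - f(p))."""
    prod = Fraction(1)
    for p, _ in prime_factors(n):
        prod *= 1 - f(p)
    return prod

f = lambda n: Fraction(1, n)   # multiplicative, with f(1) = 1
```

For example, with $n = 12$ both sides equal $(1 - \frac12)(1 - \frac13) = \frac13$.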


Theorem A.9 Let $f(n)$ be a multiplicative function. If
\[
\lim_{p^k \to \infty} f(p^k) = 0
\]
as $p^k$ runs through the sequence of all prime powers, then
\[
\lim_{n \to \infty} f(n) = 0.
\]

Proof. There exist only finitely many prime powers $p^k$ such that $|f(p^k)| \ge 1$.
Let
\[
A = \prod_{|f(p^k)| \ge 1} |f(p^k)|.
\]
Then $A \ge 1$. Let $0 < \varepsilon < A$. There exist only finitely many prime powers $p^k$ such
that $|f(p^k)| \ge \varepsilon/A$. It follows that there are only finitely many integers $n$ such
that
\[
|f(p^k)| \ge \varepsilon/A
\]
for every prime power $p^k$ that exactly divides $n$. Therefore, if $n$ is sufficiently
large, then $n$ is divisible by at least one prime power $p^k$ such that $|f(p^k)| < \varepsilon/A$,
and so $n$ can be written in the form
\[
n = \prod_{i = 1}^{r} p_i^{k_i} \prod_{i = r + 1}^{r + s} p_i^{k_i} \prod_{i = r + s + 1}^{r + s + t} p_i^{k_i},
\]
where $p_1, \ldots, p_{r + s + t}$ are pairwise distinct prime numbers such that
\begin{align*}
1 \le |f(p_i^{k_i})| & \quad \text{for } i = 1, \ldots, r, \\
\varepsilon/A \le |f(p_i^{k_i})| < 1 & \quad \text{for } i = r + 1, \ldots, r + s, \\
|f(p_i^{k_i})| < \varepsilon/A & \quad \text{for } i = r + s + 1, \ldots, r + s + t,
\end{align*}
and
\[
t \ge 1.
\]
Therefore,
\[
|f(n)| = \prod_{i = 1}^{r} |f(p_i^{k_i})| \prod_{i = r + 1}^{r + s} |f(p_i^{k_i})| \prod_{i = r + s + 1}^{r + s + t} |f(p_i^{k_i})| < A (\varepsilon/A)^t \le \varepsilon.
\]
This completes the proof. 


A.4 The divisor function


The divisor function $d(n)$ counts the number of positive divisors of $n$. For example,
$d(n) = 1$ if and only if $n = 1$, and $d(n) = 2$ if and only if $n$ is prime.


Theorem A.10 Let
\[
m = p_1^{k_1} \cdots p_r^{k_r}
\]
be a positive integer, where $p_1, \ldots, p_r$ are distinct primes and $k_1, \ldots, k_r$ are
nonnegative integers. Then
\[
d(m) = (k_1 + 1) \cdots (k_r + 1).
\]
If $m$ and $n$ are any positive integers, then
\[
d(mn) \le d(m) d(n).
\]
If $(m, n) = 1$, then
\[
d(mn) = d(m) d(n),
\]
that is, the divisor function is multiplicative.
Proof. Every divisor $d$ of $m$ can be written uniquely in the form
\[
d = p_1^{j_1} \cdots p_r^{j_r},
\]
where
\[
0 \le j_i \le k_i
\]
for $i = 1, \ldots, r$. Since there are $k_i + 1$ choices of $j_i$ for each $i = 1, \ldots, r$, it follows
that
\[
d(m) = \prod_{i = 1}^{r} (k_i + 1).
\]
Let $n$ be a positive integer, and let
\[
n = p_1^{\ell_1} \cdots p_r^{\ell_r},
\]
where $\ell_1, \ldots, \ell_r$ are nonnegative integers. Then
\[
d(n) = \prod_{i = 1}^{r} (\ell_i + 1).
\]
Since
\[
mn = \prod_{i = 1}^{r} p_i^{k_i + \ell_i}
\]
and since
\[
k_i + \ell_i + 1 \le (k_i + 1)(\ell_i + 1)
\]
for all nonnegative numbers $k_i$ and $\ell_i$, it follows that
\[
d(mn) = \prod_{i = 1}^{r} (k_i + \ell_i + 1) \le \prod_{i = 1}^{r} (k_i + 1)(\ell_i + 1) = d(m) d(n).
\]
If $(m, n) = 1$, then $k_i = 0$ or $\ell_i = 0$ for each $i = 1, \ldots, r$. In this case,
\[
k_i + \ell_i + 1 = (k_i + 1)(\ell_i + 1)
\]
and
\[
d(mn) = \prod_{i = 1}^{r} (k_i + \ell_i + 1) = \prod_{i = 1}^{r} (k_i + 1) \prod_{i = 1}^{r} (\ell_i + 1) = d(m) d(n).
\]


This completes the proof. 
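
The product formula for $d(m)$ gives a direct way to compute the divisor function, and the inequality $d(mn) \le d(m)d(n)$ can then be tested on small integers. The following Python sketch is only an illustration; the function name `d` mirrors the notation of the theorem, and the trial-division factorization is an assumption made for simplicity, not an efficient method.

```python
from math import gcd

def d(n):
    """Number of positive divisors of n, as the product of (k_i + 1)."""
    count, p = 1, 2
    while p * p <= n:
        k = 0
        while n % p == 0:   # extract the full power of p dividing n
            n //= p
            k += 1
        count *= k + 1
        p += 1
    if n > 1:               # one remaining prime factor with exponent 1
        count *= 2
    return count
```

For example, $12 = 2^2 \cdot 3$ gives $d(12) = (2 + 1)(1 + 1) = 6$.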


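Theorem A.11
\[
d(n) \ll_{\varepsilon} n^{\varepsilon}
\]
for every $\varepsilon > 0$.


Proof. Let $f(n) = d(n)/n^{\varepsilon}$. We shall prove that $f(n) = o(1)$. Since the arithmetic
functions $d(n)$ and $n^{\varepsilon}$ are multiplicative, it follows that $f(n)$ is multiplicative, and
so, by Theorem A.9, it suffices to prove that
\[
\lim_{p^k \to \infty} f(p^k) = 0.
\]
Since $(k + 1)/2^{k\varepsilon/2}$ is bounded for $k \ge 1$, we have
\[
f(p^k) = \frac{d(p^k)}{p^{k\varepsilon}} = \frac{k + 1}{p^{k\varepsilon}}
= \left( \frac{k + 1}{p^{k\varepsilon/2}} \right) \frac{1}{p^{k\varepsilon/2}}
\le \left( \frac{k + 1}{2^{k\varepsilon/2}} \right) \frac{1}{(p^k)^{\varepsilon/2}}
\ll \frac{1}{(p^k)^{\varepsilon/2}},
\]
which tends to 0 as $p^k$ tends to infinity.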

This completes the proof. 


Theorem A.12 


D(x) =} / d(n) = xlogx + (2y — 1)x + O(/x). 


n<x 


Proof. We can interpret the divisor function $d(n)$ and the sum function $D(x)$ geometrically. In the $uv$-plane,
\[
d(n) = \sum_{d \mid n} 1 = \sum_{n = uv} 1
\]
counts the number of lattice points $(u,v)$ on the rectangular hyperbola $uv = n$ that lie in the quadrant $u > 0$, $v > 0$. Then $D(x)$ counts the number of lattice points in this quadrant that lie on or under the hyperbola $uv = x$, that is, the number of points $(u,v)$ with positive integral coordinates such that $1 \le u \le x$ and $1 \le v \le x/u$.
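These lattice points can be divided into three pairwise disjoint classes:
\[
1 \le u \le \sqrt{x} \quad\text{and}\quad 1 \le v \le \sqrt{x},
\]
or
\[
1 \le u \le \sqrt{x} \quad\text{and}\quad \sqrt{x} < v \le x/u,
\]
or
\[
\sqrt{x} < u \le x \quad\text{and}\quad 1 \le v \le x/u.
\]
The last class consists of the lattice points $(u,v)$ such that
\[
1 \le v \le \sqrt{x} \quad\text{and}\quad \sqrt{x} < u \le x/v.
\]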


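It follows from Theorem A.5 that
\[
\begin{aligned}
D(x) &= [\sqrt{x}]^2 + \sum_{1 \le u \le \sqrt{x}} \left( \left[\frac{x}{u}\right] - [\sqrt{x}] \right) + \sum_{1 \le v \le \sqrt{x}} \left( \left[\frac{x}{v}\right] - [\sqrt{x}] \right) \\
&= [\sqrt{x}]^2 + 2 \sum_{1 \le u \le \sqrt{x}} \left( \left[\frac{x}{u}\right] - [\sqrt{x}] \right) \\
&= 2 \sum_{1 \le u \le \sqrt{x}} \left( \frac{x}{u} - \left\{\frac{x}{u}\right\} \right) - \left( \sqrt{x} - \{\sqrt{x}\} \right)^2 \\
&= 2x \sum_{1 \le u \le \sqrt{x}} \frac{1}{u} - 2 \sum_{1 \le u \le \sqrt{x}} \left\{\frac{x}{u}\right\} - x + O(\sqrt{x}) \\
&= 2x \left( \log \sqrt{x} + \gamma + O\left(\frac{1}{\sqrt{x}}\right) \right) - x + O(\sqrt{x}) \\
&= x \log x + (2\gamma - 1)x + O(\sqrt{x}),
\end{aligned}
\]
where $[t]$ and $\{t\}$ denote the integer part and the fractional part of the real number $t$.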
This completes the proof. 
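
The lattice-point count in this proof rests on the exact identity $D(x) = 2\sum_{1 \le u \le \sqrt{x}} [x/u] - [\sqrt{x}]^2$, which also yields a way to compute $D(x)$ with about $\sqrt{x}$ operations. The following Python sketch, restricted to positive integers $x$, compares this with the direct count and with the main term of Theorem A.12. The function names and the constant `EULER_GAMMA` are assumptions of this illustration, not notation from the text.

```python
from math import isqrt, log, sqrt

EULER_GAMMA = 0.5772156649015329  # Euler's constant, to double precision


def D_direct(x):
    """D(x) as the number of lattice points (u, v) with u >= 1, v >= 1, uv <= x."""
    return sum(x // u for u in range(1, x + 1))


def D_hyperbola(x):
    """D(x) = 2 * sum_{u <= sqrt(x)} [x/u] - [sqrt(x)]^2, using about sqrt(x) terms."""
    s = isqrt(x)
    return 2 * sum(x // u for u in range(1, s + 1)) - s * s


def main_term(x):
    """The main term x log x + (2*gamma - 1) x of Theorem A.12."""
    return x * log(x) + (2 * EULER_GAMMA - 1) * x


def scaled_error(x):
    """(D(x) - main term) / sqrt(x); the theorem asserts that this is bounded."""
    return (D_hyperbola(x) - main_term(x)) / sqrt(x)
```

Evaluating `scaled_error` at increasing powers of 10 gives a quick empirical view of the error term $O(\sqrt{x})$.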


Theorem A.13
\[
\sum_{n \le x} \frac{d(n)}{n} = \frac{1}{2} (\log x)^2 + O(\log x).
\]


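Proof. It follows from Theorem A.12 that
\[
D(x) = \sum_{n \le x} d(n) = x \log x + O(x).
\]
By partial summation, we obtain
\[
\begin{aligned}
\sum_{n \le x} \frac{d(n)}{n} &= \frac{D(x)}{x} + \int_1^x \frac{D(t)}{t^2}\, dt \\
&= \frac{x \log x + O(x)}{x} + \int_1^x \frac{t \log t + O(t)}{t^2}\, dt \\
&= \log x + O(1) + \int_1^x \frac{\log t}{t}\, dt + O\left( \int_1^x \frac{1}{t}\, dt \right) \\
&= \frac{1}{2} (\log x)^2 + O(\log x).
\end{aligned}
\]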


This completes the proof. 


Theorem A.14
\[
\sum_{n \le x} d(n)^2 \ll x (\log x)^3.
\]


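Proof. Since $d(ab) \le d(a)d(b)$ for all positive integers $a$ and $b$, we have
\[
\begin{aligned}
\sum_{n \le x} d(n)^2 &= \sum_{n \le x} d(n) \sum_{n = ab} 1 \\
&= \sum_{ab \le x} d(ab) \\
&\le \sum_{ab \le x} d(a)d(b) \\
&= \sum_{a \le x} d(a) \sum_{b \le x/a} d(b) \\
&= \sum_{a \le x} d(a) \left( \frac{x}{a} \log \frac{x}{a} + O\left(\frac{x}{a}\right) \right) \\
&\le x \log x \sum_{a \le x} \frac{d(a)}{a} + O\left( x \sum_{a \le x} \frac{d(a)}{a} \right) \\
&\ll x (\log x)^3.
\end{aligned}
\]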


This completes the proof. 


A.5 The Euler φ-function

Let $n \ge 1$. We denote by $\varphi(n)$ the number of positive integers $a \le n$ such that $(a,n) = 1$. If $a \equiv b \pmod{n}$, then $(a,n) = (b,n)$, and so $\varphi(n)$ also counts the number of congruence classes modulo $n$ that are relatively prime to $n$. This is exactly the order of the multiplicative group of units in the ring $\mathbf{Z}/n\mathbf{Z}$.


Theorem A.15 The arithmetic function $\varphi(n)$ is multiplicative, and
\[
\varphi(n) = n \prod_{p \mid n} \left( 1 - \frac{1}{p} \right).
\]


Proof. Let $(m,n) = 1$, and let $\varphi(m) = r$ and $\varphi(n) = s$. Let $a_1, \ldots, a_r$ and $b_1, \ldots, b_s$ be complete sets of representatives of the congruence classes relatively prime to $m$ and $n$, respectively. We shall prove that the $rs$ numbers $a_i n + b_j m$ for $i = 1, \ldots, r$ and $j = 1, \ldots, s$ form a complete set of representatives of the congruence classes relatively prime to $mn$. If
\[
a_i n + b_j m \equiv a_k n + b_\ell m \pmod{mn},
\]
then
\[
a_i n + b_j m \equiv a_k n + b_\ell m \pmod{n}
\]
and so
\[
b_j m \equiv b_\ell m \pmod{n}.
\]
Since $(m,n) = 1$, we have
\[
b_j \equiv b_\ell \pmod{n}.
\]
This implies that $j = \ell$. Similarly, we obtain $i = k$. Thus, the $rs$ integers $a_i n + b_j m$ represent distinct congruence classes modulo $mn$. If $(a_i n + b_j m, mn) > 1$ for some $i$ and $j$, then some prime $p$ divides $mn$ and $a_i n + b_j m$. Since $(m,n) = 1$, the prime $p$ divides exactly one of $m$ and $n$. If $p$ divides $m$, then $p$ divides $a_i n$, and so $p$ divides $a_i$. This contradicts the fact that $(a_i, m) = 1$. The case in which $p$ divides $n$ is handled in the same way. Therefore, $(a_i n + b_j m, mn) = 1$ for all $i$ and $j$.

We shall show that every congruence class relatively prime to $mn$ is of this form. We note that $(m,n) = 1$ implies that the $r$ integers $a_i n$ form a complete set of representatives of the congruence classes relatively prime to $m$, and the $s$ integers $b_j m$ form a complete set of representatives of the congruence classes relatively prime to $n$. Let $(c, mn) = 1$. Then $(c,m) = 1$, and so
\[
c \equiv a_i n \pmod{m}
\]
for some $i$. Since
\[
(c, n) = (c - a_i n, n) = 1,
\]
it follows that
\[
c - a_i n \equiv b_j m \pmod{n}
\]
for some $j$. Therefore,
\[
c \equiv a_i n + b_j m \pmod{n}
\]
and
\[
c \equiv a_i n + b_j m \pmod{m};
\]
hence
\[
c \equiv a_i n + b_j m \pmod{mn}.
\]
Thus,
\[
\varphi(mn) = rs = \varphi(m)\varphi(n).
\]


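This proves that $\varphi$ is multiplicative. If $p$ is prime and $k \ge 1$, the only integers not prime to $p^k$ are multiples of $p$, and so
\[
\varphi(p^k) = p^k - p^{k-1} = p^k \left( 1 - \frac{1}{p} \right).
\]
Therefore,
\[
\varphi(n) = \prod_{\substack{p^k \| n \\ k \ge 1}} \varphi(p^k) = \prod_{\substack{p^k \| n \\ k \ge 1}} p^k \left( 1 - \frac{1}{p} \right) = n \prod_{p \mid n} \left( 1 - \frac{1}{p} \right).
\]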
This completes the proof. 
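
Both parts of Theorem A.15 lend themselves to a direct numerical check. In the following Python sketch, `phi_brute` counts from the definition, `phi` evaluates the product formula in integer arithmetic, and `combined_residues` forms the numbers $a_i n + b_j m$ used in the proof. All of these names are hypothetical helpers introduced for the illustration.

```python
from math import gcd


def phi_brute(n):
    """phi(n) from the definition: count 1 <= a <= n with (a, n) = 1."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def phi(n):
    """phi(n) = n * prod_{p | n} (1 - 1/p), evaluated with integers only."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p  # multiply result by (1 - 1/p)
        p += 1
    if m > 1:  # one prime factor larger than sqrt(n) may remain
        result -= result // m
    return result


def reduced_residues(n):
    """Sorted residues in [0, n) of the classes relatively prime to n."""
    return sorted(a % n for a in range(1, n + 1) if gcd(a, n) == 1)


def combined_residues(m, n):
    """The numbers a_i*n + b_j*m reduced modulo mn, for coprime m and n."""
    return sorted((a * n + b * m) % (m * n)
                  for a in reduced_residues(m)
                  for b in reduced_residues(n))
```

When $(m,n) = 1$, the list `combined_residues(m, n)` should coincide with `reduced_residues(m * n)`, which is the content of the first half of the proof.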
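Theorem A.16 Let $\varepsilon > 0$. Then
\[
n^{1-\varepsilon} < \varphi(n) < n
\]
for all sufficiently large $n$.

Proof. It is clear that $\varphi(n) < n$ for all $n > 1$. We shall prove that
\[
\lim_{n \to \infty} \frac{n^{1-\varepsilon}}{\varphi(n)} = 0.
\]
Since $p/(p-1) \le 2$ for every prime number $p$, we have
\[
\frac{p^{m(1-\varepsilon)}}{\varphi(p^m)} = \frac{p^{m(1-\varepsilon)}}{p^m - p^{m-1}} = \frac{p}{p-1} \cdot \frac{p^{m(1-\varepsilon)}}{p^m} \le \frac{2}{p^{m\varepsilon}}.
\]
Therefore,
\[
\lim_{p^m \to \infty} \frac{p^{m(1-\varepsilon)}}{\varphi(p^m)} = 0.
\]
Since the arithmetic function $n^{1-\varepsilon}/\varphi(n)$ is multiplicative, the result follows from Theorem A.9.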


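Theorem A.17
\[
\sum_{n \le x} \frac{1}{\varphi(n)} \ll \log x.
\]

Proof. Let $d^*$ denote the square-free part of $d$, that is,
\[
d^* = \prod_{p \mid d} p.
\]
Then
\[
\frac{n}{\varphi(n)} = \prod_{p \mid n} \left( 1 - \frac{1}{p} \right)^{-1} = \prod_{p \mid n} \left( 1 + \frac{1}{p} + \frac{1}{p^2} + \cdots \right) = \sum_{d^* \mid n} \frac{1}{d},
\]
and so
\[
\begin{aligned}
\sum_{n \le x} \frac{1}{\varphi(n)} &= \sum_{n \le x} \frac{1}{n} \sum_{d^* \mid n} \frac{1}{d} \\
&= \sum_{d = 1}^{\infty} \frac{1}{d} \sum_{\substack{n \le x \\ d^* \mid n}} \frac{1}{n} \\
&= \sum_{d = 1}^{\infty} \frac{1}{d d^*} \sum_{m \le x/d^*} \frac{1}{m} \\
&\ll \sum_{d = 1}^{\infty} \frac{1}{d d^*} \log x.
\end{aligned}
\]
The integers of the form $dd^*$ are precisely the integers that are square-full, in the sense that if a prime $p$ divides the integer, then $p^2$ also divides it. We have
\[
\sum_{d = 1}^{\infty} \frac{1}{d d^*} = \prod_{p} \left( 1 + \frac{1}{p^2} + \frac{1}{p^3} + \cdots \right) < \infty.
\]
This completes the proof.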


A.6 The Möbius function

The fundamental property of the Möbius function is the following.


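Theorem A.18
\[
\sum_{d \mid n} \mu(d) = \delta(n) =
\begin{cases}
1 & \text{if } n = 1, \\
0 & \text{if } n \ge 2.
\end{cases}
\]

Proof. This is clearly true for $n = 1$. If $n \ge 2$, then
\[
n = \prod_{i=1}^{k} p_i^{r_i},
\]
where $k \ge 1$, $p_1, \ldots, p_k$ are distinct prime numbers, and $r_i \ge 1$ for $i = 1, \ldots, k$. Let $\sum'$ denote a sum over square-free integers, and let $\omega(d)$ denote the number of distinct prime divisors of $d$. Then
\[
\begin{aligned}
\sum_{d \mid n} \mu(d) &= \sideset{}{'}\sum_{d \mid n} \mu(d) \\
&= \sum_{d \mid p_1 \cdots p_k} \mu(d) \\
&= \sum_{d \mid p_1 \cdots p_k} (-1)^{\omega(d)} \\
&= \sum_{\ell = 0}^{k} \binom{k}{\ell} (-1)^{\ell} \\
&= (1 - 1)^k = 0.
\end{aligned}
\]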


This completes the proof. 
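
Theorem A.18 is easy to confirm numerically. The following Python sketch assumes the standard definition of the Möbius function, namely $\mu(n) = 0$ if $n$ is divisible by the square of a prime and $\mu(n) = (-1)^k$ if $n$ is the product of $k$ distinct primes. The names `mobius`, `sum_mu_over_divisors`, and `delta` are chosen here for illustration.

```python
def mobius(n):
    """mu(n): 0 if a square of a prime divides n, else (-1)^(number of primes)."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:  # p^2 divided the original argument
                return 0
            result = -result
        p += 1
    if n > 1:  # one remaining prime factor
        result = -result
    return result


def sum_mu_over_divisors(n):
    """The left side of Theorem A.18: the sum of mu(d) over all d dividing n."""
    return sum(mobius(d) for d in range(1, n + 1) if n % d == 0)


def delta(n):
    """The function delta(n): 1 if n = 1 and 0 if n >= 2."""
    return 1 if n == 1 else 0
```

A loop comparing `sum_mu_over_divisors(n)` with `delta(n)` over a range of $n$ checks the theorem case by case.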
We define the arithmetic function
\[
1(n) = 1
\]
for all $n$. Then Theorem A.18 can be rewritten in the form
\[
\mu * 1 = \delta.
\]
A nonempty set $D$ of positive integers is called divisor-closed if whenever $n \in D$ and $d$ divides $n$, then $d \in D$. For example, the set $\mathbf{N}$ of all positive integers and the set of positive integers less than a fixed number $z$ are divisor-closed sets. The set of all divisors of a fixed positive integer is also divisor-closed. If $f$ and $g$ are functions defined on a divisor-closed set $D$, then their Dirichlet convolution $f * g$ is also defined on $D$.


Theorem A.19 Let $D$ be a divisor-closed set, and let $f(n)$ be a function defined for all $n \in D$. If $g$ is the function defined on $D$ by
\[
g(n) = \sum_{d \mid n} f(d),
\]
then
\[
f(n) = \sum_{d \mid n} \mu\left( \frac{n}{d} \right) g(d)
\]
for all $n \in D$.
Conversely, let $g$ be a function defined on $D$. If $f$ is the function defined on $D$ by
\[
f(n) = \sum_{d \mid n} \mu\left( \frac{n}{d} \right) g(d),
\]
then
\[
g(n) = \sum_{d \mid n} f(d)
\]
for all $n \in D$.


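Proof. If $n \in D$ and $d \mid n$, then $d \in D$, since $D$ is divisor-closed. Let
\[
g(n) = \sum_{d \mid n} f(d)
\]
for $n \in D$. Then
\[
g = f * 1,
\]
and so
\[
\begin{aligned}
\sum_{d \mid n} \mu\left( \frac{n}{d} \right) g(d) &= (g * \mu)(n) \\
&= ((f * 1) * \mu)(n) \\
&= (f * (1 * \mu))(n) \\
&= (f * \delta)(n) \\
&= f(n).
\end{aligned}
\]
Similarly, if
\[
f(n) = \sum_{d \mid n} \mu\left( \frac{n}{d} \right) g(d) = (g * \mu)(n),
\]
then
\[
\begin{aligned}
\sum_{d \mid n} f(d) &= (f * 1)(n) \\
&= ((g * \mu) * 1)(n) \\
&= (g * (\mu * 1))(n) \\
&= (g * \delta)(n) \\
&= g(n).
\end{aligned}
\]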


This completes the proof. 


Theorem A.20 Let $f$ and $g$ be arithmetic functions. Then
\[
g(n) = \sum_{d \mid n} f(d)
\]
if and only if
\[
f(n) = \sum_{d \mid n} \mu\left( \frac{n}{d} \right) g(d).
\]

Proof. This follows immediately from Theorem A.19 with the divisor-closed set $D$ equal to the set $\mathbf{N}$ of all positive integers.
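
Möbius inversion can be tried out on any arithmetic function. The following Python sketch forms $g(n) = \sum_{d \mid n} f(d)$ from a given $f$ and then recovers $f$ from $g$ by the inversion formula. The helper names `mobius`, `divisors`, `summatory`, and `invert` are assumptions of this example, and $\mu$ is computed from its standard definition.

```python
def mobius(n):
    """mu(n): 0 if a square of a prime divides n, else (-1)^(number of primes)."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def divisors(n):
    """All positive divisors of n in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def summatory(f, n):
    """g(n) = sum of f(d) over the divisors d of n."""
    return sum(f(d) for d in divisors(n))


def invert(g, n):
    """f(n) = sum of mu(n/d) g(d) over the divisors d of n."""
    return sum(mobius(n // d) * g(d) for d in divisors(n))
```

For example, with `f = lambda n: n * n + 1` and `g = lambda n: summatory(f, n)`, the value `invert(g, n)` agrees with `f(n)` for every $n$ that is tried.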


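Theorem A.21 Let $f(x)$ and $g(x)$ be functions defined for all real numbers $x \ge 1$. Then
\[
g(x) = \sum_{d \le x} f(x/d)
\]
if and only if
\[
f(x) = \sum_{d \le x} \mu(d) g(x/d).
\]

Proof. Let $f$ be a function defined for all $x \ge 1$. If
\[
g(x) = \sum_{d \le x} f(x/d),
\]
then
\[
\begin{aligned}
\sum_{d \le x} \mu(d) g(x/d) &= \sum_{d \le x} \mu(d) \sum_{d' \le x/d} f(x/dd') \\
&= \sum_{dd' \le x} \mu(d) f(x/dd') \\
&= \sum_{m \le x} f(x/m) \sum_{d \mid m} \mu(d) \\
&= f(x).
\end{aligned}
\]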


The proof in the opposite direction is similar.

Theorem A.22 Let $D$ be a finite divisor-closed set, and let $f$ and $g$ be functions defined on $D$. If
\[
g(n) = \sum_{\substack{d \in D \\ n \mid d}} f(d)
\]
for all $n \in D$, then
\[
f(n) = \sum_{\substack{d \in D \\ n \mid d}} \mu\left( \frac{d}{n} \right) g(d)
\]
for all $n \in D$. Conversely, if
\[
f(n) = \sum_{\substack{d \in D \\ n \mid d}} \mu\left( \frac{d}{n} \right) g(d)
\]
for all $n \in D$, then
\[
g(n) = \sum_{\substack{d \in D \\ n \mid d}} f(d)
\]
for all $n \in D$.
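Proof. This is a straightforward computation:
\[
\begin{aligned}
\sum_{\substack{d \in D \\ n \mid d}} \mu\left( \frac{d}{n} \right) g(d) &= \sum_{\substack{d \in D \\ n \mid d}} \mu\left( \frac{d}{n} \right) \sum_{\substack{k \in D \\ d \mid k}} f(k) \\
&= \sum_{nh \in D} \mu(h) \sum_{\substack{k \in D \\ nh \mid k}} f(k) \\
&= \sum_{nh \in D} \mu(h) \sum_{nh\ell \in D} f(nh\ell) \\
&= \sum_{nr \in D} f(nr) \sum_{\substack{h \in D \\ h \mid r}} \mu(h) \\
&= \sum_{nr \in D} f(nr) \sum_{h \mid r} \mu(h) \\
&= f(n).
\end{aligned}
\]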


The proof in the opposite direction is similar. 


A.7 Ramanujan sums 


Let $q$ and $n$ be integers with $q \ge 1$. The exponential sum
\[
c_q(n) = \sum_{\substack{a = 1 \\ (a,q) = 1}}^{q} e\left( \frac{an}{q} \right) \tag{A.2}
\]
is called the Ramanujan sum. These sums play an important role in the proof of Vinogradov’s theorem (Chapter 8).


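Theorem A.23 The Ramanujan sum $c_q(n)$ is a multiplicative function of $q$, that is, if $(q,q') = 1$, then
\[
c_{qq'}(n) = c_q(n) c_{q'}(n).
\]

Proof. Since every congruence class relatively prime to $qq'$ can be written uniquely in the form $aq' + a'q$ with $1 \le a \le q$, $1 \le a' \le q'$, and $(a,q) = (a',q') = 1$, it follows that if $(q,q') = 1$, then
\[
\begin{aligned}
c_q(n) c_{q'}(n) &= \sum_{\substack{a = 1 \\ (a,q) = 1}}^{q} e\left( \frac{an}{q} \right) \sum_{\substack{a' = 1 \\ (a',q') = 1}}^{q'} e\left( \frac{a'n}{q'} \right) \\
&= \sum_{\substack{a = 1 \\ (a,q) = 1}}^{q} \sum_{\substack{a' = 1 \\ (a',q') = 1}}^{q'} e\left( \frac{(aq' + a'q)n}{qq'} \right) \\
&= \sum_{\substack{a'' = 1 \\ (a'',qq') = 1}}^{qq'} e\left( \frac{a''n}{qq'} \right) \\
&= c_{qq'}(n).
\end{aligned}
\]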


Theorem A.24 The Ramanujan sum can be expressed in the form
\[
c_q(n) = \sum_{d \mid (q,n)} \mu\left( \frac{q}{d} \right) d.
\]
In particular, if $(q,n) = 1$, then
\[
c_q(n) = \mu(q).
\]
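Proof. Since
\[
f_d(n) = \sum_{\ell = 1}^{d} e\left( \frac{\ell n}{d} \right) =
\begin{cases}
d & \text{if } d \mid n, \\
0 & \text{if } d \nmid n,
\end{cases}
\]
it follows that
\[
\begin{aligned}
c_q(n) &= \sum_{\substack{k = 1 \\ (k,q) = 1}}^{q} e\left( \frac{kn}{q} \right) \\
&= \sum_{k = 1}^{q} e\left( \frac{kn}{q} \right) \sum_{d \mid (k,q)} \mu(d) \\
&= \sum_{d \mid q} \mu(d) \sum_{\substack{k = 1 \\ d \mid k}}^{q} e\left( \frac{kn}{q} \right) \\
&= \sum_{d \mid q} \mu(d) \sum_{\ell = 1}^{q/d} e\left( \frac{\ell n}{q/d} \right) \\
&= \sum_{d \mid q} \mu(d) f_{q/d}(n) \\
&= \sum_{d \mid q} \mu(q/d) f_d(n) \\
&= \sum_{\substack{d \mid q \\ d \mid n}} \mu(q/d) d \\
&= \sum_{d \mid (n,q)} \mu(q/d) d.
\end{aligned}
\]
If $(q,n) = 1$, then $c_q(n) = \mu(q)$.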
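
The agreement between the exponential sum (A.2) and the divisor sum of Theorem A.24 can be observed numerically. The following Python sketch assumes the usual convention $e(t) = e^{2\pi i t}$ and the standard definition of $\mu$. The function names `e`, `c_definition`, and `c_divisor_sum` are introduced only for this illustration, and the comparison is made up to floating-point rounding.

```python
import cmath
from math import gcd


def mobius(n):
    """mu(n): 0 if a square of a prime divides n, else (-1)^(number of primes)."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def e(t):
    """e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * cmath.pi * t)


def c_definition(q, n):
    """c_q(n) from (A.2): sum of e(an/q) over 1 <= a <= q with (a, q) = 1."""
    return sum(e(a * n / q) for a in range(1, q + 1) if gcd(a, q) == 1)


def c_divisor_sum(q, n):
    """c_q(n) from Theorem A.24: sum of mu(q/d) * d over d dividing (q, n)."""
    g = gcd(q, n)
    return sum(mobius(q // d) * d for d in range(1, g + 1) if g % d == 0)
```

The divisor sum is an exact integer, whereas `c_definition` returns a complex number whose imaginary part and rounding error should be negligible.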
Theorem A.25 The Ramanujan sum can be expressed in the form
\[
c_q(n) = \frac{\mu(q/(q,n)) \varphi(q)}{\varphi(q/(q,n))}.
\]
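Proof. We define
\[
q' = q/(q,n).
\]
If the prime $p$ divides $q$ but not $q'$, then $p \mid (q,n)$. It follows from Theorem A.15 that
\[
\begin{aligned}
\frac{\varphi(q)}{\varphi(q')} &= \frac{q \prod_{p \mid q} (1 - 1/p)}{q' \prod_{p \mid q'} (1 - 1/p)} \\
&= (q,n) \prod_{\substack{p \mid q \\ p \nmid q'}} (1 - 1/p) \\
&= (q,n) \prod_{\substack{p \mid (q,n) \\ p \nmid q'}} (1 - 1/p).
\end{aligned}
\]
Then
\[
\begin{aligned}
c_q(n) &= \sum_{d \mid (q,n)} \mu\left( \frac{q}{d} \right) d \\
&= \sum_{d \mid (q,n)} \mu\left( \frac{q}{(q,n)} \cdot \frac{(q,n)}{d} \right) d \\
&= \sum_{cd = (q,n)} \mu(q'c) d \\
&= \sum_{\substack{cd = (q,n) \\ (q',c) = 1}} \mu(q') \mu(c) d \\
&= \mu(q') (q,n) \sum_{\substack{cd = (q,n) \\ (q',c) = 1}} \frac{\mu(c)}{c} \\
&= \mu(q') (q,n) \prod_{\substack{p \mid (q,n) \\ p \nmid q'}} \left( 1 - \frac{1}{p} \right) \\
&= \frac{\mu(q') \varphi(q)}{\varphi(q')}.
\end{aligned}
\]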
This completes the proof. 
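
Since both sides of the identity are integers, Theorem A.25 can be compared with the divisor sum of Theorem A.24 in exact arithmetic. The following Python sketch does this. The names `c_divisor_sum` and `c_quotient` are hypothetical, and the helpers `mobius` and `phi` implement the standard definition of $\mu$ and the product formula of Theorem A.15.

```python
from math import gcd


def mobius(n):
    """mu(n): 0 if a square of a prime divides n, else (-1)^(number of primes)."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def phi(n):
    """phi(n) = n * prod_{p | n} (1 - 1/p), evaluated with integers only."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p
        p += 1
    if m > 1:
        result -= result // m
    return result


def c_divisor_sum(q, n):
    """Theorem A.24: sum of mu(q/d) * d over d dividing (q, n)."""
    g = gcd(q, n)
    return sum(mobius(q // d) * d for d in range(1, g + 1) if g % d == 0)


def c_quotient(q, n):
    """Theorem A.25: mu(q') * phi(q) / phi(q') with q' = q / (q, n)."""
    q1 = q // gcd(q, n)
    return mobius(q1) * phi(q) // phi(q1)  # the division is exact
```

The quotient form needs only one value of $\mu$ and two values of $\varphi$, which makes it convenient when $(q,n)$ has many divisors.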


A.8 Infinite products 


This is a brief introduction to infinite products and Euler products.
Let $\alpha_1, \alpha_2, \ldots, \alpha_n, \ldots$ be a sequence of complex numbers. The $n$th partial product of this sequence is the number
\[
p_n = \alpha_1 \alpha_2 \cdots \alpha_n = \prod_{k=1}^{n} \alpha_k.
\]
If as $n$ tends to infinity, the sequence of $n$th partial products converges to a limit $\alpha$ different from zero, then we say that the infinite product $\prod_{k=1}^{\infty} \alpha_k$ converges and
\[
\prod_{k=1}^{\infty} \alpha_k = \lim_{n \to \infty} p_n = \lim_{n \to \infty} \prod_{k=1}^{n} \alpha_k = \alpha.
\]
We say that the infinite product diverges if either the limit of the sequence of partial products does not exist or the limit exists but is equal to zero. In the latter case, we say that the infinite product diverges to zero.
Let
\[
\alpha_k = 1 + a_k.
\]
If the infinite product $\prod_{k=1}^{\infty} (1 + a_k)$ converges, then $a_k \ne -1$ for all $k$. Moreover,
\[
\lim_{k \to \infty} (1 + a_k) = \lim_{k \to \infty} \frac{p_k}{p_{k-1}} = 1,
\]
and so
\[
\lim_{k \to \infty} a_k = 0.
\]


Theorem A.26 Let $a_k \ge 0$ for all $k \ge 1$. The infinite product $\prod_{k=1}^{\infty} (1 + a_k)$ converges if and only if the infinite series $\sum_{k=1}^{\infty} a_k$ converges.

Proof. Let $s_n = \sum_{k=1}^{n} a_k$ be the $n$th partial sum and let $p_n = \prod_{k=1}^{n} (1 + a_k)$ be the $n$th partial product. Since $a_n \ge 0$, the sequences $\{s_n\}$ and $\{p_n\}$ are both monotonically increasing, and $p_n \ge 1$ for all $n$. Since
\[
1 + x \le e^x
\]
for all real numbers $x$, we have
\[
0 \le \sum_{k=1}^{n} a_k < \prod_{k=1}^{n} (1 + a_k) \le \prod_{k=1}^{n} e^{a_k} = \exp\left( \sum_{k=1}^{n} a_k \right),
\]
and so
\[
0 \le s_n < p_n \le e^{s_n}.
\]
This inequality implies that the sequence $\{p_n\}$ converges if and only if the sequence $\{s_n\}$ converges. This completes the proof.
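
The inequality $s_n < p_n \le e^{s_n}$ at the heart of this proof can be watched in floating-point arithmetic. The following Python sketch computes the partial sums and partial products for a sequence of nonnegative terms supplied as a function `a`. The names `partial_sums_and_products` and `satisfies_bounds` are assumptions of this example.

```python
from math import exp


def partial_sums_and_products(a, n):
    """Return the lists [s_1, ..., s_n] and [p_1, ..., p_n] for a(1), ..., a(n)."""
    sums, products = [], []
    s, p = 0.0, 1.0
    for k in range(1, n + 1):
        s += a(k)
        p *= 1.0 + a(k)
        sums.append(s)
        products.append(p)
    return sums, products


def satisfies_bounds(a, n):
    """Check s_k < p_k <= exp(s_k) for every k up to n."""
    sums, products = partial_sums_and_products(a, n)
    return all(s < p <= exp(s) for s, p in zip(sums, products))
```

With $a_k = 1/k^2$ both sequences remain bounded, while with $a_k = 1/k$ the partial product telescopes to $p_n = n + 1$ and grows without bound together with the harmonic sums.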
We say that the infinite product $\prod_{n=1}^{\infty} (1 + a_n)$ converges absolutely if the infinite product
\[
\prod_{n=1}^{\infty} (1 + |a_n|)
\]
converges.


Theorem A.27 If the infinite product $\prod_{n=1}^{\infty} (1 + a_n)$ converges absolutely, then it converges.


Proof. Let
\[
p_n = \prod_{k=1}^{n} (1 + a_k)
\]
and let
\[
P_n = \prod_{k=1}^{n} (1 + |a_k|).
\]
If the infinite product converges absolutely, then the sequence of partial products $\{P_n\}$ converges and so the series
\[
\sum_{n=2}^{\infty} (P_n - P_{n-1})
\]
converges. Since
\[
\begin{aligned}
0 \le |p_n - p_{n-1}| &= |a_n p_{n-1}| \\
&= \left| a_n \prod_{k=1}^{n-1} (1 + a_k) \right| \\
&\le |a_n| \prod_{k=1}^{n-1} (1 + |a_k|) \\
&= |a_n| P_{n-1} \\
&= P_n - P_{n-1},
\end{aligned}
\]
it follows that
\[
\sum_{n=2}^{\infty} |p_n - p_{n-1}|
\]
converges, and so
\[
\sum_{n=2}^{\infty} (p_n - p_{n-1}) = \lim_{n \to \infty} \sum_{k=2}^{n} (p_k - p_{k-1}) = \lim_{n \to \infty} (p_n - p_1)
\]
converges. Thus, the sequence of partial products $\{p_n\}$ converges to some finite limit.

We must prove that this limit is not zero. Since the infinite product $\prod_{k=1}^{\infty} (1 + a_k)$ converges absolutely, it follows from Theorem A.26 that the series $\sum_{k=1}^{\infty} |a_k|$ converges, and so the numbers $a_k$ converge to zero. Therefore, for all sufficiently large integers $k$,
\[
|1 + a_k| \ge 1/2
\]
and
\[
\left| \frac{-a_k}{1 + a_k} \right| \le 2|a_k|.
\]
It follows that the series
\[
\sum_{k=1}^{\infty} \left| \frac{-a_k}{1 + a_k} \right|
\]
converges, and so the infinite product
\[
\prod_{k=1}^{\infty} \left( 1 - \frac{a_k}{1 + a_k} \right)
\]
converges absolutely. This implies that the sequence of $n$th partial products
\[
\prod_{k=1}^{n} \left( 1 - \frac{a_k}{1 + a_k} \right) = \prod_{k=1}^{n} \frac{1}{1 + a_k} = \frac{1}{\prod_{k=1}^{n} (1 + a_k)} = \frac{1}{p_n}
\]
converges to a finite limit, and so the limit of the sequence $\{p_n\}$ is nonzero. Therefore, the infinite product $\prod_{k=1}^{\infty} (1 + a_k)$ converges.

An Euler product is an infinite product over the prime numbers. We denote sums and products over the primes by $\sum_p$ and $\prod_p$, respectively.


Theorem A.28 Let $f(n)$ be a multiplicative function that is not identically zero. If the series
\[
\sum_{n=1}^{\infty} f(n)
\]
converges absolutely, then
\[
\sum_{n=1}^{\infty} f(n) = \prod_{p} \left( 1 + f(p) + f(p^2) + \cdots \right) = \prod_{p} \left( 1 + \sum_{k=1}^{\infty} f(p^k) \right).
\]
If $f(n)$ is completely multiplicative, then
\[
\sum_{n=1}^{\infty} f(n) = \prod_{p} (1 - f(p))^{-1}.
\]
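Proof. If $\sum_{n=1}^{\infty} f(n)$ converges absolutely, then the series
\[
a_p = \sum_{k=1}^{\infty} f(p^k)
\]
converges absolutely for every prime $p$. Also, the series
\[
\begin{aligned}
\sum_{p} |a_p| &= \sum_{p} \left| \sum_{k=1}^{\infty} f(p^k) \right| \\
&\le \sum_{p} \sum_{k=1}^{\infty} |f(p^k)| \\
&\le \sum_{n=1}^{\infty} |f(n)|
\end{aligned}
\]
converges, and so the infinite product
\[
\prod_{p} (1 + a_p) = \prod_{p} \left( 1 + \sum_{k=1}^{\infty} f(p^k) \right)
\]
converges absolutely. By Theorem A.27, this infinite product converges.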
Let $\varepsilon > 0$, and choose an integer $N_0$ such that
\[
\sum_{n > N_0} |f(n)| < \varepsilon.
\]
For every positive integer $n$, let $P(n)$ denote the greatest prime factor of $n$. Then $\sum_{P(n) \le N}$ denotes the sum over the integers all of whose prime factors are less than or equal to $N$, and $\sum_{P(n) > N}$ denotes the sum over the integers that have at least one prime factor strictly greater than $N$. Since the series $\sum_{k=0}^{\infty} f(p^k)$ converges absolutely for every prime number $p$, any finite number of these series can be multiplied together term by term. Let $N \ge N_0$. It follows from the unique factorization of integers as products of primes that
\[
\prod_{p \le N} \left( 1 + \sum_{k=1}^{\infty} f(p^k) \right) = \sum_{P(n) \le N} f(n)
\]
and so
\[
\begin{aligned}
\left| \sum_{n=1}^{\infty} f(n) - \prod_{p \le N} \left( 1 + \sum_{k=1}^{\infty} f(p^k) \right) \right| &= \left| \sum_{n=1}^{\infty} f(n) - \sum_{P(n) \le N} f(n) \right| \\
&= \left| \sum_{P(n) > N} f(n) \right| \\
&\le \sum_{P(n) > N} |f(n)| \\
&\le \sum_{n > N} |f(n)| \\
&\le \sum_{n > N_0} |f(n)| \\
&< \varepsilon.
\end{aligned}
\]


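Therefore,
\[
\sum_{n=1}^{\infty} f(n) = \lim_{N \to \infty} \prod_{p \le N} \left( 1 + \sum_{k=1}^{\infty} f(p^k) \right) = \prod_{p} \left( 1 + \sum_{k=1}^{\infty} f(p^k) \right).
\]
If $f(n)$ is completely multiplicative, then $f(p^k) = f(p)^k$ for all primes $p$ and all nonnegative integers $k$. Since $f(p^k)$ tends to zero as $k$ tends to infinity, it follows that $|f(p)| < 1$. Summing the geometric progression, we obtain
\[
1 + \sum_{k=1}^{\infty} f(p^k) = 1 + \sum_{k=1}^{\infty} f(p)^k = \frac{1}{1 - f(p)},
\]
and so
\[
\prod_{p} \left( 1 + \sum_{k=1}^{\infty} f(p^k) \right) = \prod_{p} (1 - f(p))^{-1}.
\]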
This completes the proof. 
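
A numerical experiment makes the second formula of Theorem A.28 concrete. The following Python sketch takes the completely multiplicative function $f(n) = n^{-s}$ with $s = 2$, truncates both the series and the Euler product at the same bound, and compares them with the classical value $\pi^2/6$ of the series, which is quoted here without proof. The function names are assumptions of this illustration.

```python
from math import isqrt, pi


def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes p <= limit."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False] * min(2, limit + 1)
    for p in range(2, isqrt(limit) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [p for p in range(2, limit + 1) if is_prime[p]]


def partial_sum(s, limit):
    """Sum of n^(-s) over 1 <= n <= limit."""
    return sum(1.0 / n**s for n in range(1, limit + 1))


def partial_euler_product(s, limit):
    """Product of (1 - p^(-s))^(-1) over the primes p <= limit."""
    product = 1.0
    for p in primes_up_to(limit):
        product *= 1.0 / (1.0 - p ** (-s))
    return product
```

The truncated product equals the sum of $n^{-s}$ over all $n$ with $P(n) \le N$, so for the same bound it lies closer to the limit than the truncated series does.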
A.9 Notes

All of the material in this chapter is basic elementary number theory. Comprehensive standard references are the books of Hardy and Wright [51] and Hua [63]. Cashwell and Everett [8] proved that the ring of arithmetic functions is a unique factorization domain. Hardy’s book Ramanujan [46] contains a chapter on Ramanujan’s function $c_q(n)$ and its connection to the problem of representing numbers as sums of squares.


A.10 Exercises 


1. Prove that
\[
\sum_{k \mid n} \mu(k) d(n/k) = 1
\]
for all $n \ge 1$.

2. Prove that if $f$ and $g$ are multiplicative functions, then the Dirichlet convolution $f * g$ is multiplicative.

3. Let $f$ and $g$ be arithmetic functions. Prove that if $f * g = 0$, then either $f = 0$ or $g = 0$. Thus, the ring of arithmetic functions is an integral domain.


4. An arithmetic function $f(n)$ is additive if $f(mn) = f(m) + f(n)$ for all positive integers $m$ and $n$ such that $(m,n) = 1$. An arithmetic function $f(n)$ is completely additive if $f(mn) = f(m) + f(n)$ for all positive integers $m$ and $n$. Let $n = p_1^{r_1} \cdots p_k^{r_k}$. We define the arithmetic functions $\omega(n)$ and $\Omega(n)$ as follows. The arithmetic function $\omega(n)$ counts the number of distinct prime factors of $n$:
\[
\omega(n) = k.
\]
The arithmetic function $\Omega(n)$ counts the number of prime factors of $n$ with multiplicities:
\[
\Omega(n) = r_1 + \cdots + r_k.
\]
Prove that $\omega(n)$ is additive but not completely additive. Prove that $\Omega(n)$ is completely additive.


5. Let $n = p_1^{r_1} \cdots p_k^{r_k}$. Liouville’s function $\lambda(n)$ is defined by
\[
\lambda(n) = (-1)^{\Omega(n)} = (-1)^{r_1 + \cdots + r_k}.
\]
Prove that $\lambda(n)$ is completely multiplicative.

6. Let $f(n)$ be an arithmetic function. There exists a unique completely multiplicative function $f_1(n)$ such that $f_1(p) = f(p)$ for all primes $p$. Show that $\mu_1(n) = \lambda(n)$.


7. Show that the functions $\mu(n)$, $\varphi(n)$, and $c_q(n)$ are not completely multiplicative.


8. Prove that
\[
d(n) \le 2^{\Omega(n)} \le n
\]
for every positive integer $n$. Prove that if $n$ is square-free, then
\[
d(n) = 2^{\omega(n)} = 2^{\Omega(n)}.
\]

9. Prove that
\[
\sum_{n \le x} d(n)^2 \gg x (\log x)^2.
\]
Hint: Apply the Cauchy-Schwarz inequality to $\sum_{n \le x} d(n)$.


10. Let $f$ be an arithmetic function. Prove that $f$ is invertible in the ring of arithmetic functions if and only if $f(1) \ne 0$.

11. Let $f$ and $g$ be arithmetic functions. Define the function $L$ by
\[
L(n) = \log n.
\]
Prove that pointwise multiplication by $L(n)$ is a derivation on the ring of arithmetic functions, that is,
\[
L \cdot (f * g) = (L \cdot f) * g + f * (L \cdot g).
\]


12. Let $f$ and $g$ be arithmetic functions with Dirichlet generating functions $F(s)$ and $G(s)$, respectively. Prove that $-F'(s)$ is the generating function for $L \cdot f$ and that $-(F(s)G(s))'$ is the generating function for $L \cdot (f * g)$.


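13. Prove that
\[
f_q(n) = \sum_{a=1}^{q} e\left( \frac{an}{q} \right) = \sum_{d \mid q} c_d(n).
\]
Use Möbius inversion to deduce Theorem A.24 from this identity.

14. Let
\[
\sigma(n) = \sum_{d \mid n} d.
\]
Prove that
\[
n \le \sigma(n) \le n \log n + O(n).
\]
Hint: $\sigma(n) = \sum_{d \mid n} n/d$.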


15. Let $\mu(n)$ be the Möbius function. Prove that
\[
\sum_{n=1}^{\infty} \frac{\mu(n)}{n^s} = \prod_{p} \left( 1 - \frac{1}{p^s} \right)
\]
for all $s > 1$.

16. Prove that the Dirichlet convolution of arithmetic functions is associative, that is, if $f(n)$, $g(n)$, and $h(n)$ are arithmetic functions, then
\[
(f * g) * h = f * (g * h).
\]

17. Let $L(n) = \log n$ for all $n \ge 1$. For any arithmetic function $f$, define $Lf$ by $Lf(n) = L(n)f(n)$. Prove that $L$ is a derivation on the ring of arithmetic functions, that is,
\[
L(f * g) = (Lf) * g + f * (Lg).
\]

18. Let $f$, $g$, and $h$ be arithmetic functions, where $h$ is completely multiplicative. Prove that
\[
g(n) = \sum_{d \mid n} f(d) h(n/d)
\]
if and only if
\[
f(n) = \sum_{d \mid n} \mu(d) g(n/d) h(d).
\]
19. Compute 
I] (:- K(k + we): 


20. Show that the infinite product
\[
\prod_{k=2}^{\infty} \left( 1 + \frac{(-1)^{k-1}}{k} \right)
\]
converges, but not absolutely.


21. Let $0 \le b_n < 1$ for all $n$. Prove that if $\sum_{n=1}^{\infty} b_n$ converges, then $\prod_{n=1}^{\infty} (1 - b_n)$ converges.

22. Let $0 \le b_n < 1$ for all $n$. Prove that if $\sum_{n=1}^{\infty} b_n$ diverges, then $\prod_{n=1}^{\infty} (1 - b_n)$ diverges to zero.